Summary


Overview 

This  document  is  a preliminary  report  on  one  aspect  of  the 
initial  phase  of  a proposed  three-year  research  program  on  distributed 
data  management.  The  work  dealt  with  here  is  the  development  of  models 
for  data  distribution.  These  models  consist  of  equations  for  system 
cost,  availability,  and  response  time  in  terms  of  appropriate  parameters 
describing  system  behavior,  usage  patterns,  etc.  This  interim  report 
deals  with  models  which  look  at  the  system  from  a very  high  level.  Low- 
level  features  - strategies,  policies,  etc.  - will  be  built  in  later  so 
that  their  effects  on  cost,  response,  and  availability  can  be  assessed. 

Cost  Model 

Besides  providing  a tool  for  further  research,  the  modeling 
effort  has  yielded  some  immediate  insights  into  the  advantages  - and 
problems  - of  distributed  data  management.  We  have  found,  for  example, 
that  data  distribution  can  be  cost  effective  - in  the  sense  that  it  may 
be  actually  cheaper  to  store  at  a remote  site  - for  reasonable  parameter 
values  and  a not  excessive  cost  differential  between  sites.  In  addition, 
it  appears  that  this  result  is  fairly  insensitive  to  the  size  of  the 
data  set  to  be  transferred,  but  it  does  require  that  the  data  be  compressed 
for  shipment. 

Availability  Model 

Perhaps  the  most  interesting  result  of  the  availability  study 
is  the  following.  If  there  are  two  copies  of  the  data  base  (located  at 
different  sites)  and  both  are  kept  immediately  accessible  and  as  up  to 
date  as  possible,  at  least  one  copy  of  the  data  base  is  available  more 
than 99 percent of the time. (This result does not take into account
scheduled down-time or the (small) possibility that both sites are down
concurrently.) If the remote copy is not a true "running spare" - i.e.,
is an inactive backup stored, say, on tape - the improvement in
availability seems hardly enough to make such backup worthwhile. This result
serves  to  emphasize  the  importance  of  developing  techniques  for  on-line 
data  base  synchronization. 

Response  Model 

The  study  of  response  time  has  led  to  some  simple  relations 
which  should  be  useful  in  algorithms  for  determining  when  a site  should 
share  its  query  load  with  other  sites  holding  a copy  of  the  data  base. 
Rather  than  being  amenable  to  a priori  study,  the  parameters  appearing 
in  this  model  are  envisioned  as  being  provided  by  real  system  monitoring 
and  measurement,  so  that  they  are  appropriate  for  decision  making  in  a 
dynamic  environment. 

Report  Format 

In the next section we discuss the goals of the modeling
program,  both  in  the  long  term  (as  a research  tool)  and  in  the  short 
term  (i.e.,  for  the  work  presented  here).  Following  this,  we  briefly 
review  work  reported  in  the  literature  which  seems  pertinent  to  our 
effort.  The  major  part  of  the  document  is  then  devoted  to  detailed 
reports on the three models: cost, availability, and response time, in
that  order. 

Report  Validity 

The  reader  should  note  that  the  results  given  in  this  report 
are to be considered tentative. The models are in the process of revision
and refinement, as well as more thorough testing. This is only a
preliminary report; conclusions reached from the models in their present state
should not be relied upon or widely disseminated.


Goals  of  the  Modeling  Program 


Long-Term  Goals 

Developing  models  to  describe  the  various  aspects  of  distributed 
data  management  is  an  integral  part  of  our  research  program.  Model 
building  - the  development  of  equations  to  describe  system  behavior, 
costs,  etc.  - is  an  essential  tool  in  computer  science  research.  Using 
a good  model,  one  can  study  alternative  design  options,  compare  decision 
strategies,  etc. 

It  is  important  that  the  model  effectively  reflect  the  real 
world  that  is  to  be  studied.  For  this  reason,  we  plan  to  build  a model 
which  is  highly  modular  and  highly  parameterized.  The  modularity  will 
provide  flexibility  and  allow  us  to  study  some  aspects  of  the  problem 
independently  of  having  a detailed  model  of  a whole  system.  For  example, 
network  problems  of  synchronization  and  deadlock  can  probably  be  studied 
through a high-level networking model which does not concern itself with
details of data management. The parametrization will allow us to put
into a high-level model guessed values for the effects of lower-level
systems. In this way we can generate some insights into what is going
on before building a complete model. Parametrization also should allow
us to mimic real PWIN system behavior as closely as available measurement
data will allow. This will maximize the PWIN-relevance of our research
results.

The  actual  research  areas  which  we  plan  to  study  in  part 
through  modeling  have  been  described  in  some  detail  in  our  Research  Plan 
(CAC  Doc.  No.  16A,  JTSA  Doc.  No.  5510).  In  the  interests  of  brevity,  we 
will  not  repeat  that  information  here. 


Goals  of  this  Preliminary  Study 

In  this  preliminary  work,  we  have  been  limited  by  time  constraints 
to  the  construction  of  fairly  superficial  models  and  to  the  identification 
of  some  promising  directions  for  further  work.  In  order  to  get  a good 
feel  for  the  broad  range  of  model  components  that  may  be  useful  to  us, 
we planned a three-pronged effort, directed towards assessing the gross
effects of data distribution on costs, availability, and response time.
Our approach has been to survey the modeling literature for work which
seemed relevant to our program and to begin to extend such work to
study the problems of distributed data management.

In  order  to  develop  a cost  model  for  network  data  distribution, 
we  have  begun  with  a cost  model  of  a hierarchical  storage  system 
and extended it by including storage at a remote site as part of the hierarchy.
We have then attempted to identify the major cost components which the
network introduces into total cost. By carefully including these
network components, we hope to have designed a basic model which we
may then expand on (by providing a greater level of detail) to study
such things as the cost overhead of various synchronization strategies.

To study availability, we have taken a similar approach.
We have begun with a model for single-site data base recovery strategies
and tried to see how the assumptions and results of that model are affected
by locating the backup copy at a remote site.

The complexity of response-time models (involving, as they
usually do, heavy usage of stochastic analysis and queueing theory)
precluded our getting very deeply into this area in the short term.
Instead, we undertook a rather superficial investigation into the conditions
under which multiple data copies and query distribution may lead to an
improvement in system responsiveness.

In summary, the primary goal of this preliminary effort has
been  to  gain  an  understanding  of  the  components  needed  to  model  the 
major  features  of  a distributed  data  management  system.  But  we  feel 
that  the  models  developed,  although  crude,  do  form  a solid  basis  for 
future work and have already provided us with some insights into the
value of data distribution.


Models in the Literature


Introduction 

The  first  step  in  our  modeling  program  has  been  to  look  closely 
at  relevant  models  reported  in  the  literature.  Actually,  modeling  is  an 
extensively  used  tool  in  computer  science,  and  models  of  one  sort  or 
another  are  found  throughout  the  literature.  For  example,  models  are 
used  for  comparisons  and  evaluations  of  alternative  system  designs. 

Such  mathematical  design  analysis  is  much  less  costly  and  time-consuming 
than  actually  building  alternative  systems  and  trying  them  out.  Models 
are  also  heavily  used  in  optimization  studies.  For  example,  optimal  (or 
near  optimal)  file  allocations  can  be  derived  from  rather  simple  formulas 
for  cost  and  response  time.  The  simplicity  of  the  formulas,  it  should 
be  noted,  is  not  a drawback  of  the  model  but  an  advantage.  Working  with 
models instead of the complex real world allows one to focus attention
on  only  those  features  that  one  wishes  to  study. 

The  models  that  we  review  in  this  section  are  therefore  generally 
not  complex  and  all-encompassing,  but  are  simple  formulas  which  seem  to 
have  some  relevance  to  problems  of  interest  to  us.  The  discussion  is 
organized  primarily  according  to  the  output  of  the  model,  and  only 
secondarily  according  to  its  context  or  application.  First,  we  consider 
response-time  models  and,  under  this  heading,  other  types  of  models 
dealing  with  time  delays  (e.g.  in  a network)  or  the  time  it  takes  a 
process  to  be  carried  out.  Thus  we  have  collected  together  various 
models  which  may  have  some  relevance  to  the  overall  response  time  of  a 
distributed  data  base.  Second,  we  consider  very  briefly  the  closely 
related concept of throughput, and techniques for modeling it.

Third, we review various models which may be useful to the
study of data availability - essentially the probability that the data
base is accessible when needed. Overall availability may involve such
factors  as  network  reliability,  system  failures,  and  recovery  strategies  - 
all  of  which  have  been  individually  modeled  in  some  context.  Finally, 
we  look  briefly  at  cost  models  - valuable  for  their  ability  to  encompass 
all  kinds  of  resource  utilization,  but  not  nearly  so  extensively  studied 
as  the  other  types  of  models. 

Models  for  Response  Time 

An  important  quantity  for  the  evaluation  of  a data  management 
system  is  the  expected  response  time,  which  may  be  defined  as  the  average 
waiting  time  from  the  initiation  of  a data  request  (or  from  the  input  of 
a query)  to  the  receipt  of  the  information.  Many  different  aspects  of  a 
data  management  system  have  an  effect  on  the  total  response  time.  These 
aspects  run  from  the  low-level  physical  organization  of  data  to  (in  a 
distributed  environment)  network  delay  times.  In  this  section  we  briefly 
review  some  important  past  work  on  modeling  these  various  aspects  and 
indicate  where  further  work  appears  needed  to  model  a complete  system. 

Data  structure  modeling.  At  the  lowest  level,  models  have 
been  developed  to  aid  in  choosing  storage  schema.  A typical  approach  is 
that  of  Gotlieb  and  Tompa  [1974].  They  consider  a number  of  alternative 
structures  - trees,  linked  lists,  etc.  - and  assume  an  expected  usage 
pattern  which  involves  the  probabilities  that  the  various  nodes  in  the 
schema  will  be  accessed.  They  then  compute  expected  run-time  costs  for 
the  alternatives.  These  "costs"  are  actually  timing  estimates,  being 
computed  as  a simple  linear  combination  of  "the  number  of  executions  of 
each of three primitive instruction types: memory accesses, arithmetic
and logical instructions, and transfers of control". The expected
numbers of executions of the different commands are obtained by simulation
studies of application programs. This last point is an important one -
at some stage in almost any modeling effort, data from simulations or
from the measurement of real system behavior is needed. Another point
to note in Gotlieb and Tompa's work is that some very important
considerations in choosing storage schema appear only as constraints. For example,
an upper bound on the allowable amount of storage space is used to
eliminate certain schema from further scrutiny. A model which would
allow  for  a trade-off  between  storage  cost  and  access  efficiency  would 
seem  to  have  more  validity. 
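
As an illustration only, the kind of calculation described above can be
sketched in a few lines of Python. The class and function names, the timing
weights, and the sample figures are hypothetical and are not taken from
Gotlieb and Tompa; the sketch merely forms the weighted sum of the three
primitive instruction counts and applies the storage bound as a hard
constraint.

```python
from dataclasses import dataclass


@dataclass
class NodeProfile:
    """Usage profile of one node of a storage schema (illustrative fields)."""
    access_prob: float   # probability that this node is accessed
    mem_accesses: float  # expected memory accesses per access to the node
    arith_logic: float   # expected arithmetic and logical instructions
    transfers: float     # expected transfers of control


def expected_run_time(nodes, t_mem, t_arith, t_transfer):
    """Linear combination of the three primitive instruction counts."""
    return sum(
        n.access_prob
        * (t_mem * n.mem_accesses + t_arith * n.arith_logic + t_transfer * n.transfers)
        for n in nodes
    )


def pick_structure(candidates, max_storage, t_mem, t_arith, t_transfer):
    """candidates maps a name to (storage_needed, list_of_NodeProfile).

    Structures exceeding max_storage are eliminated outright, as in the
    constraint-based approach; the fastest remaining one is returned.
    """
    feasible = {
        name: nodes
        for name, (storage, nodes) in candidates.items()
        if storage <= max_storage
    }
    if not feasible:
        return None
    return min(
        feasible,
        key=lambda name: expected_run_time(feasible[name], t_mem, t_arith, t_transfer),
    )


# Hypothetical figures: the tree is faster but needs more storage.
candidates = {
    "tree": (200, [NodeProfile(1.0, 3, 2, 1)]),
    "linked_list": (100, [NodeProfile(1.0, 10, 5, 5)]),
}
```

With these assumed numbers the tree is chosen when 250 units of storage are
allowed and the linked list when only 150 are, which shows how a hard bound
hides the trade-off between storage cost and access efficiency.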

At  what  is  perhaps  a higher  level,  Shneiderman  [1974]  has 
developed  a model  for  optimizing  the  structure  of  multilevel  indexes. 
Again,  he  describes  his  model  as  a "cost"  model,  but  he  is  explicitly 
computing  access  times.  His  approach  is  a very  simple  one.  Assuming 

1.  a given  number  of  levels, 

2.  the  branching  pattern  of  the  index  tree, 

3.  a strategy  for  searching  the  tree, 

4.  the  costs  (times)  for  moving  from  node  to  node  in  the  search, 
and 

5.  an  equal  probability  of  request  for  all  items, 

he  derives  a straightforward  algebraic  formula  for  expected  search  time. 
As an obvious (and necessary) generalization, he suggests relaxing
assumption (5). Shneiderman's basic approach, however, appears to be a
useful  one  which  may  be  readily  incorporated  into  any  analyses  of  tree 
structures  which  arise  in  our  modeling  effort. 
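
A minimal sketch of a calculation in this spirit follows. It is not
Shneiderman's formula: we assume, purely for illustration, that each index
node is scanned sequentially, that every comparison costs t_compare, and
that moving down one level costs t_descend. All names and values are
hypothetical. Passing explicit per-level probabilities corresponds to
relaxing assumption (5).

```python
def expected_comparisons(probs):
    """Expected entries examined in a sequential scan of one index node.

    probs[i] is the probability that the wanted entry is the (i+1)-th one.
    """
    return sum((i + 1) * p for i, p in enumerate(probs))


def expected_search_time(fanouts, t_compare, t_descend, probs_per_level=None):
    """Expected time to search a multilevel index.

    fanouts[k] is the number of entries per node at level k. When
    probs_per_level is None, all entries are equally likely (assumption 5).
    """
    total = 0.0
    for level, fanout in enumerate(fanouts):
        if probs_per_level is None:
            probs = [1.0 / fanout] * fanout
        else:
            probs = probs_per_level[level]
        total += expected_comparisons(probs) * t_compare + t_descend
    return total


# Two levels, three entries per node, equal request probabilities.
uniform = expected_search_time([3, 3], t_compare=1.0, t_descend=2.0)

# Same index, but requests are skewed towards the first entry of each node.
skewed = expected_search_time(
    [3, 3], 1.0, 2.0, probs_per_level=[[0.7, 0.2, 0.1], [0.7, 0.2, 0.1]]
)
```

Under these assumptions a skewed request pattern shortens the expected
search, which is the reason the generalization is worth making.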


A more ambitious effort in the use of modeling to evaluate
file structures was carried out by Winkler and Dale [1971]. In their
words, they study "the processing time required to evaluate Boolean
functions defined on data values ... [and to] select elements from the
structure satisfying the expression". They derive some rather complex
algebraic formulas for expected processing time. There are over twenty
input parameters describing such things as properties of the average
query, file size and timing data. Specific, alternative structures are
modeled in the sense that processing time formulas are developed for
them. This paper merits closer study in our proposed work on data
structuring.

Computer  system  modeling.  Many  of  the  response-time  models  in 
the  literature  that  may  be  of  use  to  us  are  not  specifically  concerned 
with data management but with computer systems in general. Both
time-sharing systems and multiprogramming systems have been the subject of
considerable analysis. Both situations are characterized by competition
for shared resources. Several jobs reside in the system simultaneously
and must occasionally wait for processing, I/O, etc. The natural
mathematical models to describe the progress of jobs through such a system of
waiting  lines  and  processors  are  those  of  queueing  theory.  Indeed, 
queueing  theory  has  been  heavily  and  successfully  used  to  develop  formulas 
for  response  time  in  such  systems. 

A classic  example  is  Scherr's  analysis  of  response  time  for 
time-sharing  systems  [Scherr,  1967].  Scherr  defines  response  time  as 
the  mean  length  of  time  the  user  spends  in  the  "working  part  of  the 
interaction"  - i.e.,  the  time  between  when  he  finishes  typing  in  his 
query and when the response is returned to him. The main input parameters
are the mean time per interaction that the user spends in thinking and
typing, and the mean processor time per interaction. Simplifying
assumptions are that the system is in a steady state (i.e., essentially that
the total number n of users on line is constant) and that there is no
overhead due to additional swapping as n increases. The latter
assumption is questionable and leads to the result that response time increases
only proportionately to n for large n. The mathematical analysis is
quite simple. A Markov process describes the probability distribution
for the number of users actually inside the system and the resulting set
of recursive equations is readily solved. Expected response time can
be  immediately  calculated  from  this  probability  distribution.  The 
validity  of  this  simple  queueing  model  was  demonstrated  by  comparing  its 
predictions  with  real  system  measurements.  The  agreement  was  extremely 
close. 
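
The calculation can be sketched as follows. This is a minimal version of
the usual finite-population single-server queue, assuming exponentially
distributed think and processor times (the text above does not state the
distributions); the function and parameter names are ours. The recursion
gives the probability p0 that no user is inside the system, and the mean
response time then follows from the steady-state balance between the
rate at which interactions are completed and the n users cycling through
thinking and waiting.

```python
def scherr_response_time(n_users, think_time, proc_time):
    """Mean response time for n_users sharing one processor.

    think_time: mean thinking-and-typing time per interaction
    proc_time:  mean processor time per interaction
    """
    if n_users < 1:
        raise ValueError("at least one user is required")
    ratio = proc_time / think_time
    # Unnormalised state probabilities: q_k = q_(k-1) * (n - k + 1) * ratio,
    # where k is the number of users inside the system and q_0 = 1.
    term = 1.0
    total = 1.0
    for k in range(1, n_users + 1):
        term *= (n_users - k + 1) * ratio
        total += term
    p0 = 1.0 / total  # probability that the processor is idle
    # Completions occur at rate (1 - p0) / proc_time; each user alternates
    # between thinking and waiting, so n = rate * (response + think_time).
    return n_users * proc_time / (1.0 - p0) - think_time


# A single user never waits; with many users the system saturates and the
# response time approaches n * proc_time - think_time.
single = scherr_response_time(1, think_time=30.0, proc_time=1.0)
saturated = scherr_response_time(200, think_time=30.0, proc_time=1.0)
```

The saturated case shows the behavior noted above: for large n the
response time grows only in proportion to n.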

More elaborate analyses of time-sharing systems have been
carried out by Kleinrock. (For a good review of this work, see [Kleinrock,
1971].) Kleinrock's analyses include various queueing disciplines
(scheduling algorithms) and various probabilistic assumptions on job
arrival times and processing time required. He has extended this type
of model virtually to its limit, in the sense that further
generalizations lead to intractable mathematical formulations.

Queueing models also play a key role in the study of
multiprogramming systems. These differ from time-sharing systems primarily
in that there is no assumed interlude when the user is thinking and
typing. That is, a certain steady-state population of jobs is assumed
to be continually moving through the system. A model which seems
especially relevant to our work is that of Arora and Gallo [1971], who
are particularly interested in the optimal storage of data in a
multilevel memory. They define the expected response time of a transaction
as  "the  serial  sum  of  the  service  times  along  with  the  respective 
waiting  times  at  all  facilities",  and  emphasize  the  importance  of  this 
statistic  in  evaluating  data  management  systems.  The  most  important 
parameters  in  their  model  are  the  I/O  dependent  timings,  such  as  the 
access  times  to  various  memory  devices  and  the  time  required  to  transfer 
a block  of  data  from  auxiliary  to  main  memory.  The  model  is  rather 
detailed  and  complex,  but  has  the  obvious  potential  to  be  extended  to 
the  study  of  data  distribution  in  a network.  One  need  only  consider 
some  memory  levels  to  be  located  remotely  and  take  into  account  network 
delay  times. 

File  allocation  modeling.  Models  which  have  been  devised  to 
study  data  distribution  are  usually  developed  from  higher  level  (and 
less  sophisticated)  analyses  than  those  referred  to  above.  An  example 
is the response-time formula derived by Chu [1973] in his study of
optimal  file  allocation  in  a network.  Variables  in  his  formula  include 
line  traffic  between  nodes  (assuming  it  is  all  generated  from  data  base 
access),  usage  rates  of  files  by  users  at  various  sites,  and  average 
lengths  of  messages.  An  interesting  feature  is  the  result  that  network 
transmission  delays  increase  with  line  traffic  according  to  the  simple 
factor P/(1 - P), where P denotes the fraction of line capacity used by
the given traffic, or traffic intensity. Indeed, Chu's expected response
time formula (for queries initiated at one given site and responded to
by another) is simply

Response Time = tP/(1 - P),

where t = average time to transmit a reply message. One sees that many
features are lacking in this simple model - time to transmit requests,
time to access the data at the remote site, protocol overhead time, etc.

In  addition,  there  appears  to  be  an  implicit  assumption  that  all  pairs 
of  sites  are  connected  by  a direct  line  used  only  for  the  query  traffic. 
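
To show how sharply this factor grows as a line approaches saturation, the
formula quoted above, read as tP/(1 - P), can be evaluated directly. The
function name and the sample values below are ours and serve only as an
illustration.

```python
def chu_response_time(t_reply, traffic_intensity):
    """Expected response time t * P / (1 - P) from the simple model above.

    t_reply:           average time to transmit a reply message
    traffic_intensity: fraction P of line capacity in use, 0 <= P < 1
    """
    if not 0.0 <= traffic_intensity < 1.0:
        raise ValueError("traffic intensity must satisfy 0 <= P < 1")
    return t_reply * traffic_intensity / (1.0 - traffic_intensity)


# Hypothetical line with a 2-second reply transmission time.
half_loaded = chu_response_time(2.0, 0.5)
heavily_loaded = chu_response_time(2.0, 0.9)
```

Raising the assumed load from one half to nine tenths of capacity
multiplies the expected response time roughly ninefold.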

Network delay modeling. As we noted in discussing Chu's
simple model for response time from a remote site, real network delays
involve many complex factors. Fortunately, much work has been done on
developing realistic formulas for network delays. (For a good review
see [Kleinrock, 1973].) This work has been largely done in the setting
of  network  design  and  analysis.  For  example,  queueing  models  have  been 
used  to  compute  average  packet  delays  for  given  network  topology,  routing 
strategy,  and  network  traffic  (including  overhead  for  routing,  flow  and 
error control, etc.). There seems to be no reason why such models cannot
be incorporated into overall models of response time for a distributed
data  base.  Detailed  modeling  of  network  delays  will  provide  a necessary 
tool  for  studying  synchronization  strategies  and  other  network-related 
features  of  distributed  data  management. 

Models  for  Throughput 

Some  authors  argue  that  response  time  is  not  as  important  a 
statistic as is throughput. For example, Arora and Gallo [1971] put the
case as follows: "In a multi-programming environment the response time
does not measure the efficiency of the system, because of the concurrent
processing of several transactions. For this reason, we introduce
throughput rate as a performance measure for the multi-programming
systems. It is the rate of completion of transactions per unit time."
(The underlining is ours.) In his analysis of multiprogramming systems,
Buzen [1971] takes the same point of view, defining "overall system
performance" as the "average number of jobs processed per unit time". A
queueing  theory  analysis  will,  however,  generate  either  response  time  or 
throughput  rate  with  equal  ease.  That  is,  these  models  assume  that  a 
certain  number  of  jobs  are  in  the  system  and  essentially  it  is  the  time 
the  average  job  spends  in  the  system  that  is  computed.  Thus  "response 
time" in these models never means the absolute time it takes an
otherwise empty system to do the job, but is always in the context of
competition with other jobs.
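
The interchangeability of the two statistics can be made concrete with
Little's law, which we bring in here as a standard queueing result rather
than as part of the cited models: with a steady population of N jobs in
the system, throughput equals N divided by the mean time a job spends in
the system. The sketch below uses hypothetical names and values.

```python
def throughput_from_response(n_jobs, mean_time_in_system):
    """Jobs completed per unit time, given the steady-state population."""
    return n_jobs / mean_time_in_system


def response_from_throughput(n_jobs, throughput):
    """Mean time a job spends in the system, given the throughput."""
    return n_jobs / throughput


# Hypothetical system holding 8 jobs, each spending 4 seconds in it.
rate = throughput_from_response(8, 4.0)
time_in_system = response_from_throughput(8, rate)
```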

In  network  analysis  throughput  has  also  been  a useful  statistic. 
For  example,  in  ARPANET  analyses  the  network  throughput  has  been  defined 
as  the  average  traffic  per  node  when  average  packet  delay  equals  0.2 
seconds  [Frank  and  Chou,  1974].  This  maximum  acceptable  average  time 
delay  then  gives  meaning  to  the  notion  of  throughput  or  "the  level  of 
traffic  that  the  network  can  handle".  However,  it  is  again  queueing 
analysis  which  is  used  to  model  the  flow  of  packets  through  the  network 
and  to  compute  throughput  under  various  conditions. 

Once  throughput  or  level  of  traffic  flowing  through  a system 
becomes  a statistic  of  interest,  the  possibility  of  using  models  analogous 
to  those  used  for  physical  flow  systems  arises.  The  stochastic  models 
and recursion equations of queueing theory may be replaced by the
continuous models and differential equations of diffusion theory. There has
recently been considerable interest in applying this type of model to
queueing networks (see, for example, [Reiser and Kobayashi, 1974]),
since more complex initial and boundary conditions can be imposed than
are tractable in stochastic models. Of course, the close connection
between throughput and average response time means that diffusion models
could be very useful in response-time studies. We therefore plan to
look closely into the applicability of diffusion models to our research.

Models for Availability


We  here  use  the  term  availability  to  mean  the  fraction  of  time 
that  a data  base  is  available  to  respond  to  user  requests  or  queries. 

In any setting, and particularly in a network, availability is a
function of the reliability (or availability) of many components - host
computers,  network  communications  lines,  etc.  - as  well  as  of  strategies 
for  backup  and  recovery.  In  this  section  we  discuss  some  of  the  past 
modeling  research  that  has  yielded  results  useful  to  us  in  our  concern 
with  database  availability. 

File  allocation  modeling.  One  of  the  factors  to  be  taken  into 
account  in  distributing  copies  of  a file  to  various  network  sites  is  the 
number  of  copies  needed  for  an  acceptable  degree  of  availability.  Chu 
[1973]  takes  account  of  this  factor  in  the  following  way.  First,  he 
defines  the  availability  of  a piece  of  equipment  (e.g.,  communication 
line  or  computer)  as 

Availability = F/(F + X),

where  F is  the  mean  time  between  failures  and  X is  the  mean  time  to 
repair.  Then,  assuming 

1)  all  computers  in  the  network  have  identical  availability  A, 

2)  all  communication  channels  have  identical  availability  c,  and 

3)  the  network  is  completely  connected; 

Chu  obtains  the  following  formula  for  the  availability  of  the  jth  file: 

A(1 - (1 - Ac)^{r_j}),

where r_j is the number of copies of the jth file in the network. Once A
and c are known, it is a simple matter to choose r_j so as to bring the
availability of a remote copy up to a satisfactory level. Overall
availability is bounded by A, the availability of the requesting
computer, which is apparently assumed not to possess a copy of the file.
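
The two formulas above, read as F/(F + X) and A(1 - (1 - Ac)^r), are easy
to evaluate, and the choice of the number of copies reduces to a short
search. The sketch below is illustrative only; the function names, the
search limit, and the sample values are our own assumptions.

```python
def component_availability(mtbf, mttr):
    """F / (F + X): mean time between failures over total cycle time."""
    return mtbf / (mtbf + mttr)


def file_availability(a_computer, c_channel, copies):
    """A * (1 - (1 - A*c)**r) for r remote copies of the file."""
    return a_computer * (1.0 - (1.0 - a_computer * c_channel) ** copies)


def copies_needed(a_computer, c_channel, target, max_copies=100):
    """Smallest number of copies reaching the target, or None if none does.

    The target is unreachable whenever it exceeds a_computer, because
    overall availability is bounded by the requesting computer's own.
    """
    for copies in range(1, max_copies + 1):
        if file_availability(a_computer, c_channel, copies) >= target:
            return copies
    return None


# Hypothetical figures: computers are up 90 percent of the time and the
# channels are assumed perfect.
one_copy = file_availability(0.9, 1.0, 1)
two_copies = file_availability(0.9, 1.0, 2)
```

With these assumed values a second copy raises availability from 0.81 to
0.891, while no number of copies can reach 0.95.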

Although Chu's model, with its assumption of complete
homogeneity of network components, may seem oversimplified, an analogous
analysis can be readily carried out in the heterogeneous case to yield
more complex expressions. Notice, however, that this model presents
another  problem.  It  implicitly  assumes  that  the  files  are  static,  or 
are  simultaneously  kept  up  to  date  by  some  trouble-free  process.  In 
fact,  the  development  of  algorithms  to  keep  segments  of  a data  base 
identical  (or  nearly  so)  is  a topic  of  current  research.  (See  the 
chapter  on  Automated  Backup  in  CAC  Doc.  No.  162,  JTSA  Doc.  No.  5509.) 

Network  reliability  modeling.  Another  simplification  in  Chu's 
model  is  the  assumption  that  a direct  communication  line  connects  every 
pair  of  sites.  This  assumption  allows  Chu  to  use  a single  parameter  to 
describe  availability  of  a link  from  one  site  to  another.  In  a general 
network,  this  availability  will  depend  in  a complex  way  upon  network 
topology.  Several  alternate  paths  may  exist  between  two  given  sites. 

Each  of  these  paths  may  involve  more  than  one  "hop"  and  so  more  than  one 
piece  of  subnet  hardware.  Indeed,  in  the  ARPA  network  it  has  been  found 
that the failure rate for IMP's is about the same as that for
communication channels, and that IMP failures therefore have the more drastic
effect on communications reliability [Frank, Kahn, and Kleinrock, 1972].
Graph theoretical techniques for computing availability from component
reliabilities are, however, well known. The paper by Frank et al.
contains a brief review of these techniques. No great difficulty is
envisioned in applying them to any given network (such as the WIN) to obtain
availabilities which may then be used in a straightforward extension of
Chu's model to obtain rough estimates of file (or data base) availability.
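
One elementary technique of this kind, shown here only as a sketch, is to
enumerate every combination of working and failed links and add up the
probabilities of those combinations in which the two sites remain
connected. We assume independent link failures and perfectly reliable
nodes; the names and the sample network are hypothetical, and the method
is practical only for small networks since the number of combinations
doubles with each link. It is not necessarily one of the techniques
reviewed by Frank et al.

```python
from itertools import product


def connected(up_links, source, target):
    """True if target can be reached from source over the working links."""
    reached = {source}
    frontier = [source]
    while frontier:
        node = frontier.pop()
        for u, v in up_links:
            # Links are treated as undirected.
            for a, b in ((u, v), (v, u)):
                if a == node and b not in reached:
                    reached.add(b)
                    frontier.append(b)
    return target in reached


def two_terminal_availability(links, source, target):
    """links maps (site, site) pairs to the availability of that link."""
    edges = list(links)
    total = 0.0
    for states in product((True, False), repeat=len(edges)):
        prob = 1.0
        up_links = []
        for edge, is_up in zip(edges, states):
            prob *= links[edge] if is_up else 1.0 - links[edge]
            if is_up:
                up_links.append(edge)
        if connected(up_links, source, target):
            total += prob
    return total


# Two hops in series, and the same route backed up by a direct line.
series = {("A", "C"): 0.9, ("C", "B"): 0.9}
triangle = {("A", "B"): 0.9, ("A", "C"): 0.9, ("C", "B"): 0.9}
```

The resulting figure could take the place of the single channel
availability c when sites are not directly connected.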

Modeling computer system reliability. Another parameter in
Chu's model that requires more detailed analysis for complete understanding
is computer availability. One source of information on computer
availability is direct system measurement. On a lower level, however,
failures can be modeled to yield, in addition to overall figures on
expected system reliability, useful insights into repair and backup
strategies.

Borgerson and Freitas [1975] recently published a fairly
detailed stochastic model for computer system failure. Their model is
based on four distinct causes of crashes and their interrelationships.
Their ultimate result is a formula giving the probability density for
the event that the system crashes due to a failure. The effects of
mechanisms for detecting and recovering from a failure (before the
system actually crashes) are included in the analysis. Although our
research is unlikely to be concerned with modeling computer systems at
this level of detail, the analytical techniques of Borgerson and Freitas
may well apply to reliability problems which we may wish to model (e.g.
protocol resiliency).

Modeling  backup  and  recovery  strategies.  This  section  has 
previously  dealt  with  availability  questions  involving  network  and  site 
reliabilities.  On  a lower  level,  the  data  base  itself  may  "crash"  or 
may acquire errors. It is important that strategies for returning a
data  base  to  its  correct  state  be  devised  and  studied. 

A recent paper [Chandy et al., 1975] provides models for
rollback and recovery strategies. These strategies run as follows. At
certain points in time (checkpoints), a copy of the data is made and
stored. A listing of subsequent data updates (i.e. an audit trail) is
then kept. When the master data base fails, it may then be recovered by
beginning  with  the  old  copy  from  the  checkpoint  and  using  the  audit 
trail  to  bring  it  up  to  date.  Chandy  et  al.  use  queueing  theory  to 
model  the  processing  of  the  audit  trail.  From  the  expected  time  to 
complete  this  process,  they  can  compute  the  total  recovery  time.  The 
length  of  the  audit  trail,  and  hence  the  time  to  recover,  is  a function 
of  the  time  interval  between  checkpoints.  Optimization  of  availability 
with  respect  to  intercheckpoint  time  can  then  be  carried  out.  Models  of 
some complexity are developed which take into consideration the
possibility of errors during recovery and the possibility of a transaction
arrival rate which varies in a cyclic manner (as opposed to being
constant). The results appear to be very useful for developing insights
into recovery strategies, particularly for single-site systems. In a
network environment, however, it may be reasonable to assume that the
backup copy is stored remotely. In this case it does not make sense to
assume that the data is always restored from the backup, because of the
long time required to transfer a data base through the network. The
strategy then is to transfer the queries to the available copy. (See the
later  section  on  the  availability  model.) 
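
The trade-off between checkpoint frequency and recovery time can be
illustrated with a deliberately simplified sketch. It is not the
queueing model of Chandy et al.; it assumes Poisson failures, a fixed
time lost per checkpoint, and an audit trail whose replay time is
proportional to the time elapsed since the last checkpoint. All
function names, parameters, and numbers are hypothetical.

```python
import math


def unavailability(interval, checkpoint_time, failure_rate,
                   reload_time, replay_ratio):
    """Approximate fraction of time lost (assumed toy model).

    interval        -- time between checkpoints (hours)
    checkpoint_time -- time lost taking one checkpoint (hours)
    failure_rate    -- failures per hour (assumed Poisson)
    reload_time     -- time to reload the checkpoint copy (hours)
    replay_ratio    -- hours of audit-trail replay per hour of updates
    """
    checkpoint_overhead = checkpoint_time / interval
    # On average a failure strikes halfway through an interval.
    expected_recovery = reload_time + replay_ratio * interval / 2.0
    return checkpoint_overhead + failure_rate * expected_recovery


def best_interval(checkpoint_time, failure_rate, replay_ratio):
    """Interval minimizing unavailability() in this toy model.

    Setting the derivative of c/T + f*rho*T/2 to zero gives
    T = sqrt(2c / (f*rho)).
    """
    return math.sqrt(2.0 * checkpoint_time / (failure_rate * replay_ratio))
```

Under these assumptions, very frequent checkpoints waste time on
copying and very infrequent ones lengthen the audit trail; the minimum
lies where the two effects balance.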

Models  for  Cost 

Cost is both a very vague and ambiguous measure of system
performance and a very important one. The ambiguity comes about through
the difficulty of assigning dollar costs to all factors of interest.
One way, of course, is to carry out experiments - i.e., to run test
programs at various sites and compare the bills received. This method
yields cost comparisons which are heavily dependent on the pricing
policies of the various sites, as well as on site hardware and software.
Untangling all of these factors to determine what a set of cost figures
really means is no easy task. On the other hand, cost is very important
in that it serves as an overall measure of system resource utilization.
For example, by assigning costs to them, such diverse factors as CPU
time and storage used can be added together. In short, costs are a
device by which one can add together apples and oranges.

Assignment of specific costs to various factors is of importance
to the model user, but not necessarily to the model builder. The latter
can consider costs of various resources to be simply weighting
coefficients, which can be adjusted at will to reflect a specific
environment. It may be, for example, that no real money changes hands.
But a user may still wish to evaluate a certain system or piece of
software by using a formula which weights storage (which may be in short
supply) much more heavily than CPU time.
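
As a minimal sketch of costs used as weighting coefficients, the
following adds CPU time and storage under two different sets of
weights. The resource names and all numbers are hypothetical and chosen
only for illustration.

```python
def weighted_cost(usage, weights):
    """Sum resource usage, each amount scaled by its weighting coefficient."""
    missing = set(usage) - set(weights)
    if missing:
        raise KeyError("no weight given for: " + ", ".join(sorted(missing)))
    return sum(weights[name] * amount for name, amount in usage.items())


# Hypothetical usage figures for one job.
usage = {"cpu_seconds": 120.0, "storage_byte_months": 2.0e6}

# Weights resembling a billing schedule (assumed values).
billing = {"cpu_seconds": 0.003, "storage_byte_months": 3.0e-5}

# Same CPU weight, but storage is treated as ten times more precious.
scarce_storage = {"cpu_seconds": 0.003, "storage_byte_months": 3.0e-4}
```

Evaluating `weighted_cost(usage, billing)` and
`weighted_cost(usage, scarce_storage)` shows how the same usage pattern
is ranked differently once storage is weighted more heavily, even if no
money changes hands.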

In  this  brief  review  of  cost  models,  we  will  be  only  concerned 
with  those  which  use  "costs"  to  add  together  heterogeneous  factors.  In 
our  search  of  the  literature  we  found  that  several  so-called  "cost" 
models  actually  dealt  only  with  time  factors.  Such  models  are  therefore 
discussed  elsewhere. 

Modeling  Network  File  Allocation.  Of  particular  relevance  to 
our  study  of  distributed  data  management  are  the  cost  analyses  developed 
for  the  network  file  allocation  problem.  A good  example  of  such  an 
analysis  is  that  given  by  Casey  [1972].  The  parameters  in  his  model  are 

1.  the  cost  ("mainly  for  storage")  of  locating  the  file  at  any 
site  k , 

2.  the costs of transmitting a given amount of data between two
given sites (with the possibility that update and query
transactions may be transmitted at different costs),

3.  the amount of update traffic emanating from each site, and

4.  the amount of query traffic emanating from each site.

Given  values  for  these  parameters,  the  cost  of  a particular  allocation 
is  readily  computed. 
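
A rough sketch of such a computation follows. It assumes (this is an
assumption for illustration, not a quotation of Casey's formulation)
that each query goes to the cheapest site holding a copy, that each
update must be sent to every copy, and that the site costs of all
copies are added. The function and argument names are hypothetical.

```python
def allocation_cost(copies, site_cost, query_traffic, update_traffic,
                    query_rate, update_rate):
    """Cost of keeping copies of one file at the sites listed in `copies`.

    site_cost[k]      -- cost of locating the file at site k
    query_traffic[j]  -- query traffic emanating from site j
    update_traffic[j] -- update traffic emanating from site j
    query_rate[j][k]  -- cost per unit of query traffic from j to k
    update_rate[j][k] -- cost per unit of update traffic from j to k
    """
    if not copies:
        raise ValueError("at least one copy is required")
    sites = range(len(site_cost))
    storage = sum(site_cost[k] for k in copies)
    # Each query is routed to the cheapest copy (assumed policy).
    queries = sum(query_traffic[j] * min(query_rate[j][k] for k in copies)
                  for j in sites)
    # Each update is applied to every copy (assumed policy).
    updates = sum(update_traffic[j] * sum(update_rate[j][k] for k in copies)
                  for j in sites)
    return storage + queries + updates


# Hypothetical three-site example with linear transmission costs.
rate = [[0, 1, 2], [1, 0, 1], [2, 1, 0]]
one_copy = allocation_cost({1}, [5, 5, 5], [10, 20, 30], [1, 2, 3], rate, rate)
two_copies = allocation_cost({0, 2}, [5, 5, 5], [10, 20, 30], [1, 2, 3], rate, rate)
```

In this made-up example the second copy pays for itself: the extra
storage and update traffic are outweighed by cheaper queries.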

Casey  states  that  transmission  costs  may  be  "a  rather  complex 
monotonically  increasing  function"  of  traffic,  but  he  feels  that  his 
linear  model  is  a good  first  approximation.  A better  idea  of  transmission 
costs  would  require  a model  which  goes  into  the  transmission  process  in 
some  detail  and  analyzes  the  various  cost  components  and  how  they  are 
affected  by  the  amount  of  network  traffic.  The  site  costs  might  also 
profit  from  a detailed  breakdown;  note  that  Casey  remarks  that  factors 
other  than  storage  are  being  lumped  into  one  term.  It  is  important  to 
realize,  however,  that  for  file  allocation  Casey's  model  is  probably 
quite  adequate.  It  is  only  when  one  wishes  to  study  other  aspects  of 
data  distribution  - backup  and  recovery  strategies,  say  - that  more 
detail  is  needed. 

Modeling storage hierarchies. Even before networks existed,
the file allocation problem was of importance. The question arose as to
where one should place a given file in a storage hierarchy - i.e., a set
of memory devices of varying accessibility (core, disk, tape, etc.)
connected to a single computer. A particularly comprehensive cost model
for this problem has recently appeared [Lum et al., 1975]. This model
differentiates between random and sequential forms of data access and
includes considerations of staging, channel costs, CPU overhead, etc.
Because of its completeness, we considered this model an appropriate one
for extension to the network case. That is, memory devices at a remote
site may simply be considered as parts of the storage hierarchy, provided
that network costs are properly taken into account. A detailed discussion
of the model of Lum et al. will therefore appear below in the discussion
of our cost model.

The distributed data management problem is of course far more
complex than the storage hierarchy problem. The model of Lum et al. (and
our extension of it) assumes that all data processing (updating and
responding to queries) takes place in local core. No provision exists
for sending a query to a remote site for processing. Thus, although our
straightforward extension of Lum's storage hierarchy model has provided
some insight into data distribution, it is grossly inadequate for studying
all the many facets of distributed data management. Unfortunately, the
literature contains little modeling work that is readily applicable to
distributed data management. Much more work needs to be done to develop
models which realistically describe the distributed environment.

A Cost Model for Data Distribution


Introduction 

The advantages of distributing a data base in a network
environment have been discussed at length in various papers, panel
discussions, and bull sessions. But it has been somewhat difficult to
quantify these advantages or to investigate the various tradeoffs and
determine just how great the advantages are. In this section, we will
attempt to shed some light on this subject. As mentioned above, a
recent paper by Lum et al. [1975] develops a cost algorithm for
allocating files in a storage hierarchy. Their cost model is rather
complete and lends itself well to extensions relevant to storage hierarchy
problems in distributed data base systems.

For many of the cost-related questions that arise in the
development of a distributed data base system (such as those concerned
with the costs of queries, updates, back-up, recovery, etc.), the system
can at first be viewed as a storage hierarchy. That is, to a local
process or user the remote sites appear as further levels of the
hierarchy. From this point of view the network is another channel with some
special cost considerations. In future refinements of this model, we
plan to include effects of remote processing of data. We were unable
to do so in this short-term effort.

In what follows we will first review the model described in
[Lum et al., 1975]. (In order to facilitate the discussion, this model
will be referred to henceforth as the LSWL model.) Next we will extend
the LSWL model to include a network. Then we will use the model along
with some relevant data to investigate some interesting questions, and
we will draw some conclusions about the advantages of distributed data
base systems. Finally, we list some ideas for refining the model.

A Review of the LSWL Model

Overview. The LSWL model primarily addresses the problem of
"data staging" or "data migration". In other words, when a file or data
set is not being used (i.e., is inactive) it is stored on one device
(usually a slower, less expensive one). Then, when the data set is
accessed, it is moved to a faster, more expensive device so that the
program will waste fewer resources waiting for data. The question we
are concerned with here is, given the accessing characteristics (number
of reads and writes, proportion of time the file is in use, etc.), where
in a given hierarchy should the data set be stored when it is inactive
and where should it be moved when it is active?

The authors develop an objective function which gives the cost
of accessing a data set which is stored on one device when inactive and
another (possibly the same device) when active. The authors assume as a
first approximation that the entire data set is moved from the inactive
device to the active one.

The selection algorithm is quite straightforward. The objective
function is evaluated for a given set of variables for each pair of
devices in the hierarchy. The lowest cost then indicates on which pair
of devices the data should be located.
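
The selection step can be sketched as an exhaustive search over device
pairs. In this sketch the objective function is passed in as a
callable; for illustration it is replaced by a lookup of the local
entries of Table 4, which appears later in this section. The function
and variable names are hypothetical.

```python
from itertools import product


def select_device_pair(levels, cost):
    """Return (inactive, active, cost) for the cheapest admissible pair.

    `cost(inactive, active)` returns the objective value for the pair,
    or None when the pair is not considered.
    """
    best = None
    for inactive, active in product(levels, repeat=2):
        value = cost(inactive, active)
        if value is None:
            continue
        if best is None or value < best[2]:
            best = (inactive, active, value)
    if best is None:
        raise ValueError("no admissible device pair")
    return best


LEVELS = ["core", "drum", "disk", "archive"]

# Local entries of Table 4, in thousands of dollars per month,
# keyed by (inactive level, active level).
LOCAL_COSTS = {
    ("core", "core"): 2000,
    ("drum", "core"): 717, ("drum", "drum"): 50.0,
    ("disk", "core"): 670, ("disk", "drum"): 19.8, ("disk", "disk"): 3.05,
    ("archive", "core"): 668, ("archive", "drum"): 17.5,
    ("archive", "disk"): 1.91, ("archive", "archive"): 3.01,
}


def lookup(inactive, active):
    return LOCAL_COSTS.get((inactive, active))
```

With these entries the search selects inactive storage on archive and
active storage on disk, which agrees with the conclusion drawn from
Table 4.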

Assumptions. The authors make several simplifying assumptions,
most of which can be relaxed at the cost of a more complex cost function.
They assume that for data sets system paging activity will not
significantly affect cost. However, it would probably be necessary to relax
this constraint if one wished to consider costs incurred by program
activity. They further assume that transfers are direct rather than
through core and that there are no flow control problems (i.e., a fast
device can always accept data from a slow device). It is also assumed
that transfers are not constrained by the capacity of the device the
data set is being moved to. These last two assumptions can both be
dropped at the cost of a more complex equation. As we shall see, when
we add a network to the hierarchy, flow control cannot be ignored.

When a process or user accesses a data set, it often must wait
for the access to complete. Clearly, this wait time must be figured
into the total cost. However, multiprogramming systems take advantage
of this wait time by letting other processes utilize the processor. To
account for this the authors define an adjusted machine cost, m. For
lack of a better formulation, they have defined this cost to be the
percent of CPU idle time times the dollar cost associated with the CPU.
There are some problems with such a definition. For example, as the
load on the system increases and so does CPU utilization, queueing
delays and system overhead also increase, thus increasing cost. The
objective function does not account for this phenomenon.

The objective function. Now that we have reviewed the
assumptions behind this analysis, let us look at the cost function itself in
some detail. The reader should consult Table 1 for a key to the
definition of the symbols used and Figure 1 for a summary of the objective
function.

Let us assume that the data set is at level i of the hierarchy
when inactive and level j when active. (For consistency we will adopt
the nomenclature used by Lum et al. whereby the first subscript will be
the inactive device, and the second the active one. Also the higher
levels (i.e., those with faster access) of the hierarchy will have
higher indices.)

Data Set Characteristics:

q     = number of sequential block accesses.
r     = number of random block accesses.
S     = data set size.
s     = physical block size.
T_i   = fraction of time data set is on level i.
d     = number of times the data set is opened.
X     = the proportion of time to write the data set back to its
        original position. For read only data sets, X = 0; for
        full write back at read speed X = 1.

Storage Device Characteristics:

t_r^i = random access time for level i.
t_q^i = sequential access time for level i.
t_s^i = transmission rate for level i.
t_l^i = average revolution latency time for level i.
t_c^i = minimum access arm movement time.
n_i   = unit cost of storage space at level i for the given time
        period.
b_i   = transfer size per access when data set is being moved from
        a lower level i to another level (or from a higher level to
        level i).
B_i   = largest size that can be transferred without additional access
        cost.

CPU and Channel Characteristics:

m     = adjusted cost per unit time for computer system excluding
        channel
M     = unadjusted computer system cost per unit time
u     = cost of channel per unit time
β     = number of buffers
W     = computer setup time for opening a data set

Table 1

Parameters in the LSWL Model
(from [Lum et al., 1975])

The objective function can be considered to have three major terms:

$$f_{ij} = (\text{storage cost}) + (\text{local process access costs}) + (\text{staging transfer costs})$$


The first term is the cost of storing the data on the active
and inactive devices.

$$\text{storage cost} = [T_i n_i + T_j n_j]\,S$$

When a data set is moved from level i to level j it is not necessarily
deleted from level i; therefore it should be noted that T_i + T_j ≥ 1.

The second term is the cost for the user or process to access
the data from the active device. This term takes into account the CPU
costs and transfer overhead as well as channel costs for both random and
sequential accesses.

$$\begin{aligned}
\text{CPU costs for sequential access} &= m q\,[(t_q^j/\beta) + (s/t_s^j)]\\
\text{CPU costs for random access} &= m r\,[t_r^j + (s/t_s^j)]\\
\text{random access channel costs} &= u r\,[t_r^j + (s/t_s^j)]\\
\text{sequential access channel costs} &= u q\,[(t_q^j/\beta) + (s/t_s^j)]
\end{aligned}$$

The final term computes the cost of moving the data from level
i to level j and includes factors for writing the data back to level i
if necessary, preparation for transfer, latency waiting for the next
block, and block transmission costs.

$$\text{cost to move data from level } i \text{ to level } j = (1 + X)\,d\,\{MW + (S/b_i)[m t_l^i + (m b_i/t_s^i) + (u b_i/t_s^i)] + (mS/B_i)\,t_c^i\}\,\Gamma(i - j),$$

where Γ(x) is 0 if x = 0 and is 1 otherwise.

$$\begin{aligned}
f_{ij} ={}& [T_i n_i + T_j n_j]\,S && \text{storage cost}\\
&+ m q\,[(t_q^j/\beta) + (s/t_s^j)] && \text{CPU cost: sequential access + transmission}\\
&+ m r\,[t_r^j + (s/t_s^j)] && \text{CPU cost: random access + transmission}\\
&+ \{u q\,[(t_q^j/\beta) + (s/t_s^j)] + u r\,[t_r^j + (s/t_s^j)]\} && \text{channel cost}\\
&+ (1 + X)\,d\,\{MW + (S/b_i)[m t_l^i + (m b_i/t_s^i) + (u b_i/t_s^i)] + (mS/B_i)\,t_c^i\}\,\Gamma(i - j) && \text{cost to move the data set between levels } i \text{ and } j
\end{aligned}$$

Figure 1

Objective Function for the LSWL Model
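
The objective function of Figure 1 can be written out in code as a
sketch. The scan of the source leaves some subscripts uncertain, so the
assignment of the inactive level i to the staging term and the active
level j to the access terms is an assumption here, as are the class and
function names. Cost rates and times must use consistent units (for
example dollars per second and seconds).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Device:
    t_r: float  # random access time
    t_q: float  # sequential access time
    t_s: float  # transmission rate (bytes per unit time)
    t_l: float  # average revolution latency time
    t_c: float  # minimum access arm movement time
    n: float    # unit cost of storage space for the period
    b: float    # transfer size per access when staging
    B: float    # largest transfer without additional access cost


def gamma(x):
    """Indicator used in Figure 1: 0 if x == 0, otherwise 1."""
    return 0 if x == 0 else 1


def lswl_cost(i, j, inactive, active, T_i, T_j,
              q, r, S, s, d, X, m, M, u, beta, W):
    """Cost f_ij for inactive level i (device `inactive`) and
    active level j (device `active`)."""
    storage = (T_i * inactive.n + T_j * active.n) * S
    per_sequential = active.t_q / beta + s / active.t_s
    per_random = active.t_r + s / active.t_s
    cpu_access = m * q * per_sequential + m * r * per_random
    channel_access = u * q * per_sequential + u * r * per_random
    per_transfer = (m * inactive.t_l
                    + m * inactive.b / inactive.t_s
                    + u * inactive.b / inactive.t_s)
    move = (1 + X) * d * (M * W
                          + (S / inactive.b) * per_transfer
                          + (m * S / inactive.B) * inactive.t_c)
    return storage + cpu_access + channel_access + move * gamma(i - j)
```

Calling `lswl_cost` once for every pair of levels and taking the
minimum reproduces the selection procedure described in the overview.
Because several entries of Table 3 are illegible in the source, no
attempt is made here to reproduce the values of Table 4.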


Network Model

Further extensions than those discussed here are necessary to
model the cost of a distributed data management system in complete
detail. However, the model developed here is a good first approximation
and will allow investigation of the tradeoffs between storage and access
economy. It will also provide an accurate model of file or data set
staging in a network.

As mentioned earlier, a primary concern in extending the LSWL
model to allow for a network in the hierarchy is to account for the flow
control and other protocol related costs that will be incurred. The
cost function used has the basic form:


$$c_{ij} = \begin{cases} f_{ij}, & i > k\\ g_{ij}, & i \le k \end{cases} \qquad (j \text{ always greater than } k)$$

where k is the first remote level of the hierarchy. (Here we are
tacitly assuming that all staging will be done to a local device.) We
have already discussed the original objective function, f_ij. We will
now proceed to consider the cost function that deals with the network.
The reader is directed to Table 2 for a key to additional symbols and to
the summary of g_ij in Figure 2. The network cost function can be
characterized as:

$$\begin{aligned}
g_{ij} ={}& (\text{storage cost})\\
&+ (\text{cost to move from inactive remote level to highest remote level})\\
&+ (\text{cost to move from highest remote level to the net})\\
&+ (\text{network cost})\\
&+ (\text{cost to move from net to active level})\\
&+ (\text{process access costs})
\end{aligned}$$

The  major  differences  in  this  equation  from  the  purely  local 
version  are  the  added  network  costs  and  the  distinction  between  local 
and  remote  charging  rates.  Otherwise  most  of  the  terms  are  special 
cases  of  the  original  and  we  will  not  discuss  them  in  detail. 

The network costs consist of two major components: the set-up
costs for using the network and the cost of the traffic sent on the
network.


$$\begin{aligned}
\text{network costs} ={}& d e\,\{(m_r + m_l)\,t_{nd} + (M_r + M_l)\,t_{np}\}\{1 + \Gamma(X)\}\\
&+ (1 + X)\,(S K n_k / b_k)\,d\\
&+ 2 e n_k d\,\{1 + \Gamma(X)\}
\end{aligned}$$


The first term is the cost of setting up the transfers in
terms of the number of message exchanges required (protocol
negotiation), network delay and protocol processing. The other two terms are
network charges for the packets actually sent. The first of these is
the cost for the data sent and the second is for the messages sent for
the set-up negotiation. The constant K in the first of these is a
"compression" factor to allow inclusion of data compression and protocol
overhead in data transmission (headers, restart markers, etc.). The
transmission cost of the network, n_k, is calculated in terms of packets
sent, a charging structure in use in the commercial domain. (It should
be noted that the symbols with the subscript k do not refer to the
properties of the highest remote level of the hierarchy but to
properties of the network, such as transmission rate, packet size, etc.)
Factors involving X are included in the network costs to take account of
the possibility of shipping the data back to inactive store. Notice
that a transfer must be set up no matter how small an amount is sent
back - hence the appearance of Γ(X) in the formula.

e     = number of message exchanges necessary to set up the transfer
t_nd  = message round trip delay time in the network
t_np  = CPU time for protocol overhead (on a per protocol message basis)
K     = "compression" factor
t_nr  = network CPU time to receive data
t_nt  = network CPU time to transmit data
u_r   = remote channel cost
u_l   = local channel cost
m_r   = adjusted remote system cost
m_l   = adjusted local system cost
      = number of data set copies necessary to achieve a desirable
        level of reliability
n_k   = network transmission cost
M_r   = unadjusted remote system cost
M_l   = unadjusted local system cost
b_k   = network packet size

Table 2

Supplementary Parameter List for Network Model

g_ij is the sum of the following terms:

(1)  $[T_i n_i + T_j n_j]\,S$
     storage cost

(2)  $(1 + X)\,d\,\{(SK/b_k)[(m_r b_k/t_s^k) + (u_r b_k/t_s^k)]\}$
     cost to move between highest remote level and net

(3)  $(1 + X)\,d\,\{M_r W + (S/b_i)[m_r t_l^i + (m_r b_i/t_s^i) + (u_r b_i/t_s^i)] + (m_r S/B_i)\,t_c^i\}$
     cost to move between inactive level i and highest remote level

(4)  $d e\,\{(m_r + m_l)\,t_{nd} + (M_r + M_l)\,t_{np}\}\{1 + \Gamma(X)\}$
     protocol set up cost

(5)  $2 e n_k d\,\{1 + \Gamma(X)\}$
     network charges for protocol messages

(6)  $(1 + X)\,(S K n_k/b_k)\,d$
     data transfer network costs

(7)  network software cost to send data and receive it
     (the expression for this term is not legible in the source)

(8)  $m_l q\,[(t_q^j/\beta) + (s/t_s^j)] + m_l r\,[t_r^j + (s/t_s^j)]$
     costs for random and sequential access and for retrieval from
     active location

(9)  $u_l q\,[(t_q^j/\beta) + (s/t_s^j)] + u_l r\,[t_r^j + (s/t_s^j)]$
     channel costs for local retrieval

(10) $(1 + X)\,d\,\{(SK/b_k)[(m_l b_k/t_s^j) + (u_l b_k/t_s^j)]\}$
     cost to move from net buffers to active device

Figure 2

Objective Function for the Network Model
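
The three components of the network cost expression (set-up, data
traffic, and protocol-message charges) can be sketched in code as
follows. The function name and the dictionary keys are hypothetical;
the parameter names follow Table 2, and cost rates and times must be in
consistent units.

```python
def gamma(x):
    """Indicator: 0 if x == 0, otherwise 1."""
    return 0 if x == 0 else 1


def network_costs(d, e, S, K, X, b_k, n_k, t_nd, t_np,
                  m_r, m_l, M_r, M_l):
    """Return the three components of the network cost expression."""
    # A write-back transfer (X > 0) needs its own set-up, hence 1 + gamma(X).
    transfers = 1 + gamma(X)
    setup = d * e * ((m_r + m_l) * t_nd + (M_r + M_l) * t_np) * transfers
    # S*K/b_k packets carry the (possibly compressed) data set.
    data = (1 + X) * (S * K * n_k / b_k) * d
    # Two packets are charged per message exchange during negotiation.
    protocol_messages = 2 * e * n_k * d * transfers
    return {"setup": setup, "data": data,
            "protocol_messages": protocol_messages}
```

The sketch makes two properties of the formula easy to see: the data
charge scales linearly with the compression factor K, and the set-up
and protocol-message charges double as soon as any amount of data is
written back.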

Example. Consider a situation in which there is a four-level
hierarchy  (core,  drum,  disk,  and  archive),  both  locally  and  at  a remote 
site.  Assume  that  values  of  the  relevant  parameters  are  as  given  in 
Table  3 (taken  from  Lum  et  al.  [1975])  and  that  they  are  the  same  at 
both  sites. 


Parameter   Core         Drum         Disk          Archive       Units

t_r^i       10^-6        5 x 10^-3    60 x 10^-3    5             second
t_s^i       ?            10^?         3 x 10^?      5 x 10^4      byte/sec
t_q^i       0            8 x 10^-3    13 x 10^-3    25 x 10^-3    second
t_l^i       0            8 x 10^-3    12 x 10^-3    20 x 10^-3    second
t_c^i       0            0            25 x 10^-3    40 x 10^-3    second
n_i         2 x 10^-2    5 x 10^-4    3 x 10^-5     3 x 10^-6     $/byte/month
b_i         *            20,000       7,000         2,000         byte
B_i         *            4 x 10^?     140,000       10,000        byte

* Irrelevant
? Illegible in the source

Table 3

Parameters for Storage Hierarchy

It does not, of course, make sense to consider inactive storage at
remote core, and this case is omitted. Let the number of local buffers
be two (β = 2) and assume that there is no setup time to open a data set
(W = 0). Suppose that a data set of 10^8 bytes is active for one
eight-hour shift per day, so that on a per-month basis d = 30 (i.e., the
data set is opened once per day). Furthermore, the set is then active 1/3
of the time (T_j = 1/3), and we shall assume that T_i = 1 (i.e., that the
set is permanently resident at the inactive location). Let the set be
blocked into 1500-byte physical records (s = 1500) and suppose that
X = 1 (so that the data set is always written back at the end of each
day). Finally assume that there are 90,000 sequential accesses to the
active copy per month and 210,000 random accesses (i.e., q = 90,000 and
r = 210,000). These values all correspond to those used by Lum et al.
in their example.

Next, network parameters are needed. We have taken b_k = 125
bytes, the ARPANET packet size; t_nd = 200 ms and t_s^k = 5 x 10^3
bytes/sec, both ARPANET figures; t_np = 1 ms, which is roughly the time
for an ARPA NCP to handle one protocol command (including response);
t_nr = 1 ms, an average figure which runs from about 0.5 ms NCP time to
2 ms if the process must be awakened; and t_nt = 2 ms, which consists of
about 1 ms to get to the NCP and 0.5 to 1 ms to use it. (These estimates
for t_np, t_nr, and t_nt were supplied to us by G. Grossman of the Center
for Advanced Computation.) It should be noted that both t_nr and t_nt
should be slightly larger to allow for data processing by the file
transfer protocol. This is particularly true if data compression is being
carried out. But for this example we initially assume K = 1. Also, t_nr
and t_nt as given are times per message; we have divided by 8 to get a
per-packet estimate, since a maximum of 8 packets per message is
allowed. The parameter e was set at 15. This is arrived at as follows.
In the ARPANET, it requires 7 exchanges to open an FTP connection, plus
from 4 to 7 commands to set parameters and 3 more to open the data
connection. It should be noted that by using ARPANET data and the
values supplied by Grossman we are essentially computing lower bounds on
network costs. In other environments the network costs will be higher
and results are likely to be quite different.

Finally, cost estimates are needed. For network transmission
we assumed n_k = $1.25 per 1000 packets, a quoted Telenet commercial rate.
To begin with we have assumed that m_r = m_l = $10/hr., M_r = M_l =
$100/hr., and u_r = u_l = $8/hr. Clearly under these assumptions remote
storage will not be cost effective; but by adjusting the cost of the
remote site relative to that locally, we should reach a point where remote
storage is cheaper. The values calculated for costs c_ij (see Figures 1
and 2) are given in Table 4. As expected, remote storage is not
economical for the assumed cost structure. The cheapest method is for the
inactive data to be stored on local archive and transferred to local disk
when active.


                              Active Location (j)

Inactive          Local      Local      Local      Local
Location (i)      Core       Drum       Disk       Archive

Local Core        2000
Local Drum        717        50.0
Local Disk        670        19.8       3.05
Local Archive     668        17.5       1.91       3.01
Remote Drum       724        73.7       58.1       60.0
Remote Disk       677        26.9       11.3       13.2
Remote Archive    675        24.9       9.27       11.2

Table 4

Computed values of total costs c_ij for the basic example.
Entries are in thousands of dollars per month.

In an attempt to determine for what relative costs it becomes
cost-effective to store remotely, we recomputed the c_ij's for a decreasing
sequence of values of M_r, m_r, and u_r. All other parameters were kept
the same. Even when the cost ratio Z was

$$Z = \frac{m_r}{m_l} = \frac{M_r}{M_l} = \frac{u_r}{u_l} = 0.1$$

the best strategy was still to store on local archive and transfer to
local disk. At this point, however, the best remote strategy (remote
archive to local disk) was less than twice as expensive as local archive
to local disk (compared with a factor of more than 4 in the Table 4
example). Closer examination of the individual terms computed showed
that what keeps remote storage from becoming cost effective are fairly
large contributions from terms (2) and (6) (cost to move from highest
remote level to net and data transfer network costs, see Figure 2). In
short, shipping a data base of 10^8 bytes back and forth across a network
daily is just not likely to be cost effective under most conditions!

If costs for shipping through the network are, as it appears,
making remote storage uneconomical, compression of the data before
shipment should help. We therefore inserted a compression factor
K = 0.1 (about as small as is realistic) into the model and recomputed
the c_ij for cost ratio Z = 0.1, and all other parameters the same as in
the Table 4 example. Remote storage now becomes cost effective - the
best strategy is to transfer from remote archive to local disk. (See
Table 5.) The reader should keep in mind throughout this discussion
that the numbers and comparisons given here should not be taken too
literally. The simplistic hierarchical storage model we are using does
not take into account, for example, cost advantages which may occur due
to remote data processing.


Table 5

Computed values of c_ij for K = 0.1, cost ratio Z = 0.1.
Entries are in thousands of dollars per month.


Starting at this point, we increased Z (since a ten-to-1 cost
ratio is probably unrealistic) to see at what value of Z remote storage
begins to become cost effective (for K = 0.1). Throughout the range of
Z values, the best local strategy is archive to disk; the best remote
strategy is remote archive to disk. We have graphed the costs of these,
versus Z, in Figure 3. The local strategy cost is, of course, independent
of K or of remote costs (and hence of changes in Z). Notice that the
crossover occurs at Z = 0.3 - a value which might well occur in practice.
Another interesting feature is the linearity of the curve, which may
make the model more useful as input into decision algorithms.
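
Because the remote cost is linear in Z, two evaluations of the model
are enough to locate the crossover. The following sketch shows the
calculation; the function name is hypothetical and the numbers in the
usage line are made up for illustration, not read from Figure 3.

```python
def crossover_ratio(z1, cost1, z2, cost2, local_cost):
    """Z at which a straight remote-cost line through (z1, cost1) and
    (z2, cost2) equals the constant local cost."""
    if z1 == z2:
        raise ValueError("two distinct values of Z are required")
    slope = (cost2 - cost1) / (z2 - z1)
    if slope == 0:
        raise ValueError("remote cost does not depend on Z")
    return z1 + (local_cost - cost1) / slope


# Hypothetical costs in thousands of dollars per month.
z_cross = crossover_ratio(0.1, 1.0, 0.5, 2.8, local_cost=1.9)
```

Remote storage is the cheaper choice for any Z below the returned
value, provided the remote cost increases with Z.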

An  unexpected  result  was  that  decreasing  S (the  data  base 
size)  to  10  bytes  and  then  to  10  bytes  led  to  virtually  no  change  in 
this  crossover  value  of  Z.  Even  at  10  bytes  the  local  and  remote  best 
strategies are nearly of equal cost at Z = 0.3. But for this small a


Figure 3

Comparative costs (in thousands of dollars per month) versus Z:
best local strategy and best remote strategies.

disk. For larger data bases (10 bytes or more), however, the main
costs are those for storage, and the minimum in the matrix corresponds
to permanent storage in local archive (no staging).


Though insensitive to S, the crossover point is clearly
sensitive to K. To investigate this feature further, we generated
the second curve in Figure 3, for which the only difference in
parameters is that K = 0.2. As expected, the crossover point has decreased,
to about Z = 0.17. To a good approximation, as we decrease the
amount of compression, the remote costs must decrease proportionately
for remote storage to remain cost effective. (A quick check showed the
trend holding for K = 0.5. In this case remote storage is almost - but
not quite - cost effective for Z = 0.1.)

In conclusion, we have seen that remote storage of even very
large data bases may be economical, providing the data is shipped
compressed and there is a sufficient differential between local and remote
costs. However, it perhaps is not reasonable to assume that the whole
data base is transferred.

A more realistic model would allow for transferring only a
portion of the data. One approach would be to transfer a block of data
only when needed. Suppose it is assumed that each access request
initiates a transfer of the relevant block or blocks. This supposition,
however, contradicts the whole basis of the present staging model -
namely, that the data base is transferred from inactive to active
storage and then accessed on a block by block basis. A compromise can
be achieved by assuming that only a portion of the entire data base is
staged daily, as discussed below.


Revised Model: Partial Staging

In this paragraph, we will consider the effects of altering
the model to take account of the possibility that only a part of the
data base is transferred from inactive to active storage. We introduce
a new parameter:

S' = size of data set transferred.

Then the following changes are to be made in the equations:

In  Figure  1,  the  storage  cost  becomes  + r.n.S',  and  in 

the  last  term  S is  replaced  by  S'. 

In Figure 2, the storage cost (term (1)) is changed just as in
Figure 1. In addition, all other occurrences of S are changed to S'.

Figure 4 shows the results of some computations for partial
staging. The parameters chosen were such that results are directly
comparable to the K = 0.2 curve in Figure 3. The absolute costs have
of course decreased considerably. However, the interesting feature
to notice is that the crossover point is virtually unchanged from
when the entire data base is transferred. This seems to be another
aspect of the relative insensitivity of cost effectiveness to changes
in data base size.

Application of the Model to Multi-site Usage

There is another type of strategy question that may readily
be studied by using the model. Suppose users at two sites wish to use
the data base, but it will be used more heavily at one (Site A) than
at the other (Site B). Should Site B use Site A's copy or store its
own locally? In this section we give an example of this type of problem
and its solution.

Suppose that the data base is 10 bytes in size, and suppose
that costs and other parameters (except those describing usage) are
the same for both sites and are those assumed in the computation of
the Table 4 entries. Let the usage at Site A also be the same as was
assumed in computing Table 4. Then we know that the best strategy from
Site A's point of view is to store the data on local archive and stage
it to local disk. We therefore assume that this is done, at a cost
of $1,914 per month (from the computation for Table 4).

Now suppose that Site B only uses 10 percent of the data
base (perhaps a different 10 percent on different days) and that Site B
performs far fewer accesses, say again by a factor of 10. We now rerun
the model to obtain the c_ij matrix from Site B's point of view. The
parameter changes to do this are: S = 10^, q = 9000, r = 21,000, and
= 0. (This last change is made because we assume that Site A makes
all the changes in the data; Site B just retrieves it.) The resulting
matrix of c_ij values is shown in Table 6. (We have omitted the "local
core" column here because the storage options involving core are too
expensive to be interesting.)


                              Active Location (j)
Inactive Location (i)    Local Drum    Local Disk    Local Archive

Local Drum                  5001
Local Disk                  1974           305
Local Archive               1712           150            301
Remote Drum                 7021          5458           5652
Remote Disk                 2329           767            900
Remote Archive              2081           518            712

Table 6. Matrix of costs c_ij for Site B. (See text.)
Entries are in dollars per month.

From the table, we see that Site B's best local strategy is to store on
archive and stage to disk, at a cost of $150. Furthermore we see that
the best strategy involving A's archive as inactive storage is to stage
the data to disk, at a cost of $518. This includes, however, a cost of
$3 per month to store 10^ bytes in A's archive, and this storage cost is
already assumed to be paid for by A. Thus the net cost to B is $515 per
month. Finally we make the comparison:

Total cost, storage at both A and B = $2064 per month.

Total cost, storage at A only = $2429 per month.

Not surprisingly (since the cost of storage itself is so small) the
first option is the cheaper. However, the increased cost of the second
option is only $365, or about 18 percent. This may be very much worthwhile
in view of the problems that arise in trying to keep more than one copy up
to date. Furthermore, this computation was carried out with K = 1 (no
data compression). If the data is transferred in compressed form, say
K = .25, Site B's best local strategy is as before. However, the best
strategy involving A's archive as inactive storage is to stage the data
to disk, at a cost of $258 (subtracting off the duplicated storage costs
as before). Thus the total cost of the second option is $2172 per
month, which is an increase of only $108 or 5 percent.
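The two-site comparison is simple arithmetic on the monthly figures quoted in the text, and a short sketch makes it easy to repeat for other cost values. The constant and function names below are illustrative only and are not part of the original model.

```python
# Monthly costs in dollars, as quoted in the text.
SITE_A_COST = 1914               # Site A: local archive staged to local disk
B_LOCAL_COST = 150               # Site B: own archive staged to own disk
B_REMOTE_GROSS = 518             # Site B staging from A's archive, K = 1
A_ARCHIVE_STORAGE = 3            # archive storage already paid for by Site A
B_REMOTE_NET_COMPRESSED = 258    # Site B net cost when K = 0.25


def compare(net_remote_cost):
    """Return (total with two copies, total with one copy, extra cost, extra fraction)."""
    both = SITE_A_COST + B_LOCAL_COST
    a_only = SITE_A_COST + net_remote_cost
    extra = a_only - both
    return both, a_only, extra, extra / both


uncompressed = compare(B_REMOTE_GROSS - A_ARCHIVE_STORAGE)
compressed = compare(B_REMOTE_NET_COMPRESSED)
print(uncompressed)
print(compressed)
```

The last element of each tuple is the fractional penalty for keeping a single copy at Site A, which is the quantity weighed against the difficulty of keeping two copies up to date.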

Plans for Further Work

Clearly, much more can be learned by experimentation with the
present model. By using parameters that describe specific systems and
their costs, we should be able to develop cost comparisons for important
real applications. In addition, we have looked carefully at the effects
of varying only a few of the many parameters in the model. By varying
others, we should gain further insights into costs.

We might also investigate other approaches to deciding on a
"best" storage policy which might be relevant in some situations. For
example, since protocol implementations reside as user-level processes
in many operating systems, and since it is often useful to consider the
data set as being staged in the remote system, it might be interesting
to consider an alternative approach which runs as follows. The data set
allocations on the remote site are determined according to the LSWL
model, and the lowest-cost strategy is selected. The cost of this
strategy plus the relevant network costs are then used to form the
lowest level of the local hierarchy, where the cost for the local levels
is computed using the LSWL model and the last level (the remote one)
uses a slightly modified form. Further study is needed to determine
whether this approach will yield useful data for decision making.

There are a number of refinements that could be added to the
model. We list a few of these here.

1)  There could be a provision for allowing some fraction of the
queries to be answered locally, while the rest require remote
access. (This feature may be useful in analyzing the cost
effectiveness of intelligent terminals or network front-ends.)

2)  The effects of the finite size of the storage devices might be
included.

3)  As mentioned earlier, the definition of the adjusted system
cost does not appear to reflect the effects of increased load
on the system. This point requires more investigation to gain
a better understanding of this parameter and of how, if
necessary, system loads may be inserted into the model.

4)  The model developed by Lum et al. was intended to represent
file migration or data staging. Thus, when a data set is
written back to the inactive device, the operation is considered
to be symmetrical to the original read. If this model
is to be an accurate characterization of a data management
system, it will be necessary to include the cost of performing
updates.

5)  Since data base reliability appears to be one of the major
advantages of distributing, it is very important that the
model be capable of evaluating the cost of various multi-copy
backup schemes with respect to the level of reliability they
provide. We have therefore provided a parameter, N, to
indicate how many copies exist in the network. Unfortunately,
we have not yet determined how this parameter should be
inserted into the model.

A Model for Distributed Data Availability


Introduction

In  this  section  we  attempt  to  quantify  the  improvement  in  data 
base  availability  which  can  be  achieved  by  storing  a backup  copy  at  one 
(or  more)  remote  sites  in  a network.  We  also  discuss  the  practicality 
of  certain  alternative  management  strategies. 

Availability is defined as the probability that at least one
copy of the data base is up and usable as a master copy for queries and
updates. Alternatively, availability can be thought of as the fraction
of time that the data base is expected to be available for use.

To simplify the analysis, we will not consider various possible
causes of data base failure, but will assume that the data is available
when the host computer is. Furthermore, we will not take into account
scheduled down time of the host computer, on the assumption that if down
time is scheduled, transfer to a backup copy is automatic and immediate,
and leads to no loss in availability. (The very existence of a backup
copy at an alternate network site will of course improve availability
considerably over the case where only one site has a copy.)

The  Model 

Parameters. The parameters in the model are as follows:

F = mean time between computer failures, assumed to be the same
for all host computers.

X = expected time to repair computer.

L = expected time to load the data base copy at the remote site.

Y = time that the audit trail of updates has been growing (i.e.,
time since the copy was correct).

k = the ratio of update arrival rate to update processing rate,
so that kY = time to process the audit trail.*

D = time delay between when the master fails and when the remote
site determines this fact and starts to get its copy ready
for use.

The equations. First, consider the case where there is a
single copy of the data base. The availability of this copy is then

    A_0 = \frac{F}{F + X + kX}

This is the usual formula for availability (mean time between failures
divided by mean time between failures plus mean time to recover), where
the mean time to recover includes repair time X plus the time kX to
process the updates accumulated while repairs were made. (This formula
for recovery time is that used by Chandy et al. [1975].) There is a
question as to whether the term kX should be included here, since the
site is technically "up" after time X. But in a network setting, it
does seem appropriate to assume that updates initiated at remote sites
are being logged somewhere, so that there does exist an update list to
be processed. In addition, we are interested primarily in comparing
with availabilities computed for multi-copy strategies, where the copies
are assumed to be up to date.

* The parameter k is referred to in the literature as a "compression"
factor [Chandy et al., 1975]. This is not to be confused with the
usual data compression factor denoted by K in the previous section.

Consider Strategy 1 for transferring usage back and forth between
master copy and backup copy. After the master copy is determined to
have failed, the remote copy is then brought up (after a time lapse of
D + L + kY) and usage is transferred to it. Meanwhile the old master is
being repaired. Queries and updates are sent to the new master, however,
until it fails, at which time the process repeats: another "new"
master is identified and activated. (This may or may not be the "old"
master.) This strategy is diagrammed (but not to scale) in Figure 5.
Since the remote site may have been up for some time since its last
failure, it is assumed that, after the data base comes up, the expected
time until failure is only F/2. (Actually a smaller number may be more
reasonable, since some host time has already been spent in the recovery
operation.) Notice that an obvious built-in assumption can be read from
the figure.

(1)  D + L + kY  < X + kX 

If this inequality is not satisfied, it theoretically does not pay to
store a remote copy, since the master is expected to be repaired and
updated before the remote copy can be activated. The formula for
availability under Strategy 1 can then be read off Figure 5 as

    A_1 = \frac{F}{2D + 2L + 2kY + F}

Strategy 2 is to immediately replace the copy by the old
master as soon as the latter has been brought back up. This scheme is
diagrammed in Figure 6. Again, inequality (1) must hold in order for
the diagram to be meaningful. There is an additional assumption which
must be made in order for our model of either strategy to be valid.
This assumption is that D + L + kY is sufficiently small compared to F
that there is little likelihood of a failure of the remote host during
the recovery process. In addition, Strategy 2 requires that

(2)  X + kX < F/2

If this is violated, there is a good probability that the copy may fail
before the master is ready. For reasonable values of F, however,
inequality (2) is readily satisfied; it is inequality (1) that must be
carefully checked in using the model. Finally, the availability formula
can be read from the diagram:

    A_2 = 1 - \frac{D + L + kY}{F + X + kX}

Keeping inequality (1) in mind and comparing formulas, we see that
Strategy 1 is generally poorer than Strategy 2, and indeed A_1 is often
less than A_0. We will therefore restrict consideration to Strategy 2.
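The three availability formulas and the two validity conditions can be collected into a small sketch for experimentation. The function names are illustrative, the sample parameter values are chosen only for demonstration, and the form X + kX < F/2 used for inequality (2) is an assumption based on the expected copy lifetime of F/2 described above.

```python
def a0(F, X, k):
    """Single-copy availability: MTBF over MTBF plus recovery time."""
    return F / (F + X + k * X)


def a1(F, D, L, k, Y):
    """Strategy 1: stay on the new master until it fails in turn."""
    return F / (F + 2 * (D + L + k * Y))


def a2(F, X, D, L, k, Y):
    """Strategy 2: switch back to the old master once it is repaired."""
    return 1 - (D + L + k * Y) / (F + X + k * X)


def model_valid(F, X, D, L, k, Y):
    """Return the truth values of inequalities (1) and (2)."""
    ineq1 = D + L + k * Y < X + k * X   # copy is ready before master is back
    ineq2 = X + k * X < F / 2           # assumed form of inequality (2)
    return ineq1, ineq2


# Illustrative values in hours; not taken from any measured system.
p = dict(F=20.0, X=1.0, D=0.01, L=0.5, k=0.01, Y=1.0)
single = a0(p["F"], p["X"], p["k"])
strat1 = a1(p["F"], p["D"], p["L"], p["k"], p["Y"])
strat2 = a2(**p)
print(single, strat1, strat2, model_valid(**p))
```

For these sample values Strategy 1 falls slightly below the single-copy availability while Strategy 2 exceeds it, which is the ordering described in the text.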

Sensitivity to parameter values. In any model, it is useful
to determine how sensitive the output values are to changes in the
inputs. Obviously, the inputs are only known approximately or are
statistical averages. If the output changes drastically for a small
change in an input value, the model is rather useless for predictive or
decision purposes. Chandy et al. [1975] use the elasticity E(f,y),
essentially the "percentage change in f caused by a percentage change in
y", to investigate the sensitivity of a function f with respect to a
parameter y. Formally, E is defined by

    E(f,y) = \frac{\partial f}{\partial y} \cdot \frac{y}{f}

We have investigated the elasticity of U = 1 - A_2 with respect
to all of the input variables. (Working with U instead of A_2 simplifies
the algebra without changing the conclusion.) We find that for all
parameters

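    \left| \frac{\partial U}{\partial y} \cdot \frac{y}{U} \right| < 1

For example, taking y = k,

    \frac{\partial U}{\partial k} = \frac{FY + XY - DX - LX}{(F + X + kX)^2}

so that

    \left| \frac{\partial U}{\partial k} \cdot \frac{k}{U} \right|
        = \left| \frac{k(FY + XY - DX - LX)}{(F + X + kX)(D + L + kY)} \right|
        = \left| \frac{kFY + kXY - \dots}{kFY + kXY + \dots} \right| < 1

And for y = Y,

    \left| \frac{\partial U}{\partial Y} \cdot \frac{Y}{U} \right|
        = \frac{kY}{F + X + kX} \cdot \frac{F + X + kX}{D + L + kY}
        = \frac{kY}{D + L + kY} < 1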


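Similar computations show that the elasticities of U with respect to D,
L, X, and F are all less than one. Elasticities of U are connected to
those of A_2 through

    \left| \frac{\partial A_2}{\partial y} \cdot \frac{y}{A_2} \right|
        = \left| \frac{\partial U}{\partial y} \cdot \frac{y}{U} \right| \cdot \frac{U}{A_2}
        < \left| \frac{\partial U}{\partial y} \cdot \frac{y}{U} \right|

as long as A_2 > U. We may conclude therefore that our model is stable,
being relatively insensitive to small changes in parameter values.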
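The elasticity bound can also be checked numerically. The sketch below estimates E(U, y) by central differences for each parameter; the helper names and the sample parameter values are illustrative assumptions, not part of the original analysis.

```python
def unavailability(p):
    """U = 1 - A_2 = (D + L + kY) / (F + X + kX)."""
    return (p["D"] + p["L"] + p["k"] * p["Y"]) / (p["F"] + p["X"] + p["k"] * p["X"])


def elasticity(f, p, name, rel_step=1e-6):
    """Central-difference estimate of (df/dy) * (y / f) for parameter `name`."""
    h = p[name] * rel_step
    hi = dict(p, **{name: p[name] + h})
    lo = dict(p, **{name: p[name] - h})
    dfdy = (f(hi) - f(lo)) / (2 * h)
    return dfdy * p[name] / f(p)


# Illustrative values in hours; every parameter must be nonzero here
# because the step size is taken relative to the parameter value.
params = {"F": 20.0, "X": 1.0, "D": 0.01, "L": 0.5, "k": 0.01, "Y": 1.0}
elasticities = {name: elasticity(unavailability, params, name) for name in params}
for name, value in elasticities.items():
    print(f"E(U, {name}) = {value:+.4f}")
```

For these values every elasticity has magnitude below one, and the estimate for Y agrees with the closed form kY/(D + L + kY).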
Experiments and Discussion

Remote journaling. In order to model a remote journaling
process, we assume that the parameter Y is large; for simplicity we
assume that it is equal to F. Thus we are essentially assuming that,
whenever the master comes up after a failure, a copy of the up-to-date
data base is shipped off to any remote site which contains a copy of the
data base. (Or that the remote data base, having been used as a master
copy while the master was down, already possesses an up-to-date copy at
this time.)

It is interesting to note that journaling remotely by shipping
the data base over the network is not feasible on a regular basis. For
example, consider a data base of 4 x 10^7 bytes (roughly FORSTAT size).
At a network throughput of 15 kilobits per second (faster than normal
for the ARPANET), it would take approximately 6 hours to ship a data
base of this size. Daily backup by, say, sending tapes by courier
would, however, be feasible in many situations.
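The shipping-time estimate is a one-line calculation. The sketch below assumes a data base of 4 x 10^7 bytes (the exponent is hard to read in the source; 10^7 is the value consistent with a transfer time of roughly 6 hours) and takes one kilobit as 1000 bits.

```python
# Assumed inputs: see the note above about the data base size.
size_bytes = 4e7
throughput_bits_per_second = 15_000   # 15 kilobits per second

seconds = size_bytes * 8 / throughput_bits_per_second
hours = seconds / 3600
print(f"{hours:.2f} hours")
```

Protocol overhead and retransmissions are ignored, so the real figure would be somewhat larger.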

The data copy at the remote site will be generally assumed to
be on tape. The value L = 0.5 hr. has been assumed in the computations
since it is approximately the time to read two to three tapes. The
parameter D is probably on the order of one or two seconds, but we have
taken it to be .01 hr. as an absolute upper bound. X = 1 seems to be a
reasonable mean value for repair time. With these parameters, we get
the following formula for improvement I in availability as a function of
F and k.

    I = \frac{A_2 - A_0}{A_0} = \frac{0.49 + k(1 - F)}{F}

It is difficult to estimate what a reasonable value of k should be. In
a similar analysis, Chandy et al. [1975] take k = 1/8. Clearly the
value will depend on the usage pattern for the data base; doubtless ways
of measuring it for a real system could be devised. However, notice
that, with k = 1/8, inequality (1) states that

    .51 + F/8 < (1 + 1/8).

Hence for this large a k the time to process the audit trail is so long
that the master is able to get up before the backup copy whenever
F > 4.92 hrs., which is an unreasonably low value.
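The restriction that inequality (1) places on F when Y = F can be sketched as follows. Solving D + L + kF < X + kX for F gives an upper limit on the mean time between failures for which the model applies; the function name is illustrative, and the default values are those assumed in the text.

```python
def max_mtbf(k, X=1.0, D=0.01, L=0.5):
    """Upper limit on F (hours) from D + L + kF < X + kX, i.e. with Y = F."""
    return (X + k * X - D - L) / k


limit_large_k = max_mtbf(1 / 8)   # the value of k used by Chandy et al.
limit_small_k = max_mtbf(0.01)    # a data base with few updates
print(limit_large_k, limit_small_k)
```

A smaller k relaxes the limit considerably, which is why the remote journaling model is only of interest when updates are infrequent.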

To get a feel for the value of remote journaling, we therefore
take k = .01; i.e., we assume that there are few updates. In this case
inequality (1) restricts the model to F < 50. A graph of I vs. F in
this case may be seen in Figure 7. Notice that for reasonable values of
F the improvement in availability is less than 5 percent, which may not
be enough to make remote journaling worthwhile. Values of A_0 have also
been plotted in the figure for reference.

Figure 7. Single-site availability A_0 and fractional improvement I
through use of Strategy 2. Parameters are k = 0.01,
D = .01 hr., X = 1 hr., L = 0.5 hr., and Y = F.

Figure 8. Same as Figure 7, except that Y = 1 hr.


As a final comment on the remote journaling strategy described
here, we note that availability may actually decrease as F increases.
For example, suppose X = 2, k = 0.25, L = 0.5 and D = 0. Then A_2 = .7692
for F = 4 and A_2 = .7647 when F = 6. Differentiating A_2 (for Y = F)
with respect to F shows that this decrease will occur whenever

    k(k + 1)X > D + L.

Intuitively, this phenomenon occurs because for large k the effect of
the lengthening audit trail to be processed outweighs that of the
increasing reliability of the host computer.
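The condition above follows from a short differentiation, written out here as a worked step using only the Strategy 2 formula with Y = F.

```latex
% Strategy 2 availability with Y = F
A_2 = 1 - \frac{D + L + kF}{F + X + kX}

% Quotient rule applied to the second term
\frac{dA_2}{dF}
  = -\frac{k(F + X + kX) - (D + L + kF)}{(F + X + kX)^2}
  = \frac{(D + L) - k(k + 1)X}{(F + X + kX)^2}

% The denominator is positive, so A_2 decreases with F exactly when
k(k + 1)X > D + L

% Check with X = 2, k = 0.25, L = 0.5, D = 0:
% k(k + 1)X = 0.25 \times 1.25 \times 2 = 0.625 > 0.5 = D + L
```

Note that the sign of the derivative does not depend on F, so for a given set of parameters the availability is either increasing in F everywhere or decreasing in F everywhere.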

Frequently updated remote journal. The lack of effectiveness
of the remote journaling strategy described in the last section seemed
to be caused by the necessity of processing an extremely long audit
trail. Suppose, then, that we drop the assumption that Y = F and assume
instead that the remote copy is periodically brought up to date. As an
example, we might assume this updating to take place every two hours, so
that the average length of audit trail to process to bring up the remote
copy is 1 hour. With all other parameters as specified for Figure 7,
but with Y = 1,

    I = .49/F.

This result, which is graphed in Figure 8, is independent of k (because of
the cancelling of kX and kY terms), as long as k and F are such that the
model is valid. Unfortunately, the improvement is still generally less
than 5 percent.

Indeed, the curves in Figures 7 and 8 are nearly identical.
To see why this should be so, consider more closely the formula for I.

    I = \frac{X + kX - D - L - kY}{F}

As long as k is small (or when X = Y as above) it is clear that

    I \approx \frac{X - D - L}{F}


Running spares. Here we assume that the backup copy is stored
on disk for virtually instantaneous access and is kept almost up to
date. Reasonable parameters for this case might be L = 0, Y = .1 hr.,
and (for comparison with the results above) X = 1, k = .01. Then we
have

    I = 0.999/F.

We will not bother to graph this; this curve looks just like the earlier
ones, only the values of I are approximately doubled. In this case,
improvements of 5 to 10 percent are seen for F between 10 and 20 -
certainly enough to make the strategy worthwhile. In fact, what happens
in this case is that, under our assumptions, availabilities are brought
up to very nearly unity. To see this, note that

    A_2 = 1 - \frac{.01 + kY}{F + (1 + k)X}

and for our example kY = 0.001. Increasing k will cause somewhat smaller
values of A_2, but A_2 will be over 99 percent for a wide range of reasonable
parameter values.
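The three backup arrangements discussed so far differ only in the values of L and Y, so their improvement figures can be compared with one function. This is a sketch using the general expression I = (X + kX - D - L - kY)/F, which follows from the formulas for A_0 and A_2; the function name and the choice F = 10 are illustrative.

```python
def improvement(F, X, D, L, k, Y):
    """Fractional improvement I = (A_2 - A_0) / A_0 of Strategy 2 over one copy."""
    return (X + k * X - D - L - k * Y) / F


F = 10.0                                  # illustrative mean time between failures
common = dict(F=F, X=1.0, D=0.01, k=0.01)

journal = improvement(L=0.5, Y=F, **common)       # remote journaling, Y = F
periodic = improvement(L=0.5, Y=1.0, **common)    # journal refreshed, Y = 1 hr.
spare = improvement(L=0.0, Y=0.1, **common)       # running spare on disk
print(journal, periodic, spare)
```

At F = 10 the first two cases give 4 to 5 percent, while the running spare gives about 10 percent, in line with the doubling noted in the text.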

Effect of varying Y. We have looked at three separate cases
which differ from one another in large part in the widely differing
values for the parameter Y. To better understand the effect of this
parameter, we select typical values of the other parameters (X = 1,
L = 0.5, D = 0.01, F = 20) and consider A_2 as a function of Y for
several different values of k. When k = .01, we have

    A_2 = 1 - \frac{0.51 + 0.01Y}{21.01}

The small coefficient of Y in this case makes the effect of Y minimal.
As Y ranges between 0 and 20, A_2 decreases linearly from 0.976 to 0.966.
Now suppose that k is increased to 0.05. In this case as Y goes from
0 to 20, A_2 decreases from 0.976 to 0.953 - still not a very dramatic
change! To a large extent what makes the "running-spare" approach so
worthwhile is not the small value of Y but the instantaneous access
(L = 0).

Conclusions  and  Plans  for  Future  Work 

We have presented here a model for data availability which,
while superficial, does seem to reflect the realities of various
strategies for backup. We have seen that remote journaling, in the sense of
storing a copy in archival storage (e.g. tape) at a remote site, leads
to very little in the way of availability improvement - perhaps 5 percent
at best. On the other hand, the running spares strategy, in which
the remote copy is nearly up to date and almost immediately accessible,
brings availability up to over 99 percent and appears to be worthwhile.
It should be noted, however, that the running spares strategy is bound
to be relatively expensive. Furthermore, before this strategy can be
effectively used, many of the problems of multi-copy management must be
solved. For example, updating must be synchronized in order for the
backup copy to be effectively kept up to date.

One point to notice about the model is the importance of the
parameter k. We found that remote journaling was theoretically of no
value unless k was fairly small. The parameter k is essentially a
proportionality factor, determining how long it takes a processor to
"catch up" when there has been a backlog of updates accumulating. The
value of k will depend on many factors - the rate at which updates are
generated, the complexity of the updating procedure, the processor
speed, etc. Some of these factors and how they enter into k are
amenable to theoretical study; others require system measurement.

Another feature of the "catch-up" time deserves some thought.
Is kY an adequate expression for this? Or should one then add on k*kY
to take account of the updates that have been entered while the first
set was being processed, and so forth? Adding on these terms would add
little complexity to the model; but it seems hardly worthwhile as long
as k is so uncertain. That is, k as an effective proportionality
constant can be assumed to include the effects of the higher order terms.
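The higher order terms mentioned above form a geometric series, so their total effect can be written in closed form. The following worked step assumes 0 <= k < 1, that is, updates arrive more slowly than they can be processed; the symbol k_eff is introduced here only for illustration.

```latex
% Catch-up time including updates that arrive during catch-up
kY + k^2 Y + k^3 Y + \dots = Y \sum_{n=1}^{\infty} k^n = \frac{kY}{1 - k},
\qquad 0 \le k < 1

% Equivalent effective proportionality constant
k_{\mathrm{eff}} = \frac{k}{1 - k}

% Examples: k = 0.01 gives k_eff of about 0.0101;
%           k = 1/8  gives k_eff = 1/7, about 0.143
```

For small k the correction is negligible, which supports treating k as an effective constant; for k near 1/8 it raises the catch-up time by about 14 percent.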

Finally, further work on this model should include some careful
statistical analysis of a number of questions. What is the
probability that a host will fail during the recovery process? What is the
probability that a "new" master copy will fail before the old one has
been repaired? (In both of these cases more than two copies would be
advantageous.) How many copies are needed to achieve a given level of
availability? Is there some "optimum" number of copies? In short,
there are a number of interesting questions which can be addressed if
the parameters in the model are considered to be random variables
instead of simple average values.


A Response-Time Model for Distributed Data


Introduction

The hypothesis to be studied in this section is that, because
of disparity in site loads, sending queries to a remote site may improve
response time, in spite of network delays. (Response time we define
here to be the length of time between when the user inputs a query and
when he receives the response.) We assume that the data base is equally
available (stored on disk and kept up to date) at all of the alternate
sites. We also will ignore such things as the effect of increased
availability on real response time. (That is, if there is only a single
copy and that goes down for several hours, the response time during that
period is clearly very poor. But this effect is hard to include in our
model in its present primitive state.)

The problems which arise in trying to develop a model of this
type are extremely difficult. First of all, the question of how machine
"load" is to be defined and measured has never been satisfactorily
resolved. We are forced to simply assume that there is such a quantity
and that it increases in proportion to the number of jobs in the system.
Second, it is uncertain as to how response time is affected by system
load. The most relevant work that we have been able to find in the
literature is in Scherr's monograph [Scherr, 1967] on time-shared
systems. Scherr carried out both theoretical and experimental studies
of response time as a function of the number of users on the system. He
found that, for a small number of users, the response time is nearly
constant, showing only a slow rise as the number of users increases.
At a certain point (defined as the "saturation" level) the response curve
takes a sharp upward turn, rising linearly with number of users
thereafter. This general shape is pictured below.

[Figure: response time as a function of number of users.]

Since Scherr assumes that the users are keeping busy, it seems to be
a valid assumption that response time will also increase linearly with
load, when the load is reasonably heavy. That is, we assume that the
region of this curve that is pertinent to our study of the advantages
of data distribution is the steep linear rise.

The  Model 


Parameters. The parameters in the model are as follows:

N = number of computers in the network which possess a copy of the
given data base.

G_i = that part of the load at computer i which is not related to
data base use.

V = number of updates per unit time.

H = number of queries per unit time.

v = load induced by an update rate V = 1 (so that Vv is the total
load due to updates).

h = load induced by a query rate H = 1 (so that Hh is the total
load due to queries).

a, ℓ = parameters describing the linear increase in response time
as load increases. That is, at a single site,

    Response time = a(load - ℓ).

We assume these parameters are the same for all sites.

T_n = increase in response time due to network delays and overhead
of sending a query to a remote site.

The equations. Suppose, for simplicity, that all queries are
entered at a single site (at computer 1, say) that possesses a copy of
the data base. If the site opts to respond to the entire query load
itself, then its total load is

$$Vv + Hh + G_1.$$

The single-site response time is then given by

$$R_s = a(Vv + Hh + G_1 - \ell).$$

Now if the site decides to distribute the queries equally among the $N$
sites which have a copy, the load on computer $i$ is

$$Vv + Hh/N + G_i.$$

Notice that all sites are assumed to have equal update loads, since all
sites have the responsibility of keeping their copies as up to date as
possible. The response time for a query answered locally is then

$$R_1 = a(Vv + Hh/N + G_1 - \ell),$$

while the response time for a query answered at remote site $i$ is

$$R_i = a(Vv + Hh/N + G_i - \ell) + T_n,$$

where $i \neq 1$.


The average response time $\bar{R}$ is then

$$\bar{R} = \frac{1}{N}(R_1 + R_2 + \cdots + R_N).$$

The quantity of interest is the ratio

$$R = \bar{R} / R_s.$$

If $R < 1$, response time is improved by distributing the queries. We
therefore would like to obtain some idea of the conditions under which
$R < 1$.
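
As an illustration only, the equations above can be evaluated numerically.
The following Python sketch is not part of the original model statement;
the function names and the sample parameter values are assumptions chosen
for the example, and "load" is treated as an abstract number.

```python
def single_site_response(a, ell, V, v, H, h, G1):
    """R_s = a(Vv + Hh + G_1 - ell): site 1 answers every query itself."""
    return a * (V * v + H * h + G1 - ell)


def distributed_response(a, ell, V, v, H, h, G, T_n):
    """Average response time when queries are split equally over N sites.

    G is the list [G_1, ..., G_N]. Index 0 is the entry site (computer 1),
    so only the remaining sites incur the network delay T_n.
    """
    N = len(G)
    total = 0.0
    for i, G_i in enumerate(G):
        r_i = a * (V * v + H * h / N + G_i - ell)
        if i != 0:
            r_i += T_n  # remote answer: add network delay and overhead
        total += r_i
    return total / N


def response_ratio(a, ell, V, v, H, h, G, T_n):
    """R = (average distributed response time) / R_s; R < 1 favors distribution."""
    return (distributed_response(a, ell, V, v, H, h, G, T_n)
            / single_site_response(a, ell, V, v, H, h, G[0]))


# Hypothetical values: two equally loaded sites and a modest network delay.
ratio = response_ratio(a=0.5, ell=2, V=1, v=1, H=4, h=1, G=[3, 3], T_n=0.5)
print(ratio)  # 0.75, so distribution improves response time in this case
```

The sample values are deliberately chosen so that every site's load exceeds
$\ell$, which is the regime in which the linear response-time expression is
assumed to hold.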

Use  of  the  Model 

For simplicity, consider the case $N = 2$. Then

$$R = \frac{R_1 + R_2}{2R_s} = 1 + \frac{G_2 - G_1 - Hh + T_n/a}{2(Vv + Hh + G_1 - \ell)}.$$

The denominator of the second term is always positive, by our assumption
that loads are large enough that response time is described by the
steeply rising line. Therefore the sign of the numerator determines
whether $R$ is greater than or less than one. That is, we have the
result:

Distribution of the queries improves response
time if and only if

$$aHh + a(G_1 - G_2) > T_n.$$

Now the parameter $a$ is the rate of increase of response time with respect
to load - the slope of the response-time curve. Thus the left side of
the above inequality is just an increase in response time due to the
query load and the load differential between sites 1 and 2. It is
intuitively reasonable that when this quantity becomes greater than $T_n$
(the increase in response time due to network delays and overhead), it
pays to distribute. For general $N$ the inequality becomes hardly more
complex:

Distribution improves response time if and only if

(I)  $aHh + a(G_1 - \bar{G}) > T_n,$

where $\bar{G}$ is the average load at the remote sites;
i.e., $\bar{G} = (G_2 + G_3 + \cdots + G_N)/(N - 1)$.

An interesting point to notice is that, if the query load is
sufficiently large, distributing the queries may improve response even if
the local site is less heavily loaded than the remote sites.
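
The point just made can be checked with a small numerical sketch of
inequality (I). The helper name `distribution_pays` and the sample loads
below are hypothetical and serve only to illustrate the relation.

```python
def distribution_pays(a, H, h, G, T_n):
    """Evaluate inequality (I): aHh + a(G_1 - G_bar) > T_n.

    G is the list [G_1, ..., G_N]; G_bar is the plain average of the
    remote loads G_2, ..., G_N (equal division of the queries is assumed).
    """
    if len(G) < 2:
        raise ValueError("need at least one remote site")
    G_bar = sum(G[1:]) / (len(G) - 1)
    return a * H * h + a * (G[0] - G_bar) > T_n


# Local site (load 2) is less heavily loaded than the remote sites (load 6).
loads = [2, 6, 6]

# Heavy query load: aHh = 5 outweighs the load differential a(G_1 - G_bar) = -2.
print(distribution_pays(a=0.5, H=10, h=1, G=loads, T_n=1))  # True

# Light query load: aHh = 1 does not, so the queries should stay local.
print(distribution_pays(a=0.5, H=2, h=1, G=loads, T_n=1))   # False
```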

Determination of the parameter values to use in this model
poses a difficult problem. As was noted earlier, the concept of load is
not well defined. Values for the $G_i$ are difficult to come by. It may
be possible, however, to make simple assumptions. For example, one
could assume that all sites are approximately equally loaded. In this
case, inequality (I) becomes

(I')  $aHh > T_n.$

At this point we have quantities which undoubtedly can be measured.
Even though we don't know what "load" is and would find it hard to
determine $a$ and $h$ individually, the term $aHh$ can be determined as
follows. Measure the response times $R(H_1)$ and $R(H_2)$ for two different
query rates $H_1$ and $H_2$. Then, assuming that the system is sufficiently
heavily loaded so that these points fall on the steep linear rise of the
response-time curve (this point can be checked by further measurements),

$$ah = \frac{R(H_1) - R(H_2)}{H_1 - H_2}.$$

Once we have a good estimate for $ah$, we can estimate $aHh$ for arbitrary
$H$. Notice that this same approach will yield estimates of the left side
of the inequality above even if we are not measuring a true query rate
$H$, but only some parameter $H'$ proportional to $H$. If the network is
homogeneous, $T_n$ can simply be measured by sending off some queries and
comparing the response time to that for locally handled queries. A data
management system can then automatically monitor query rate and response
times and use inequality (I') to decide when queries should be distributed.
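
A minimal sketch of such a monitoring rule follows, assuming that the two
measurements lie on the linear part of the response-time curve and that all
sites are about equally loaded. The function names and the measurement
values are hypothetical, not taken from any measured system.

```python
def estimate_ah(H1, R1, H2, R2):
    """Slope of response time against query rate: ah = (R(H1) - R(H2)) / (H1 - H2)."""
    if H1 == H2:
        raise ValueError("the two query rates must differ")
    return (R1 - R2) / (H1 - H2)


def should_distribute(ah, H, T_n):
    """Inequality (I'): distribute the queries when aHh > T_n."""
    return ah * H > T_n


# Hypothetical measurements: 4.0 s response at 10 queries per unit time,
# 6.0 s at 20 queries per unit time; measured network penalty T_n = 2.0 s.
ah = estimate_ah(10, 4.0, 20, 6.0)

print(should_distribute(ah, H=15, T_n=2.0))  # True: aHh is about 3 s
print(should_distribute(ah, H=5, T_n=2.0))   # False: aHh is about 1 s
```

If only a quantity $H'$ proportional to $H$ is observable, the same two
functions can be applied to $H'$ unchanged, since the proportionality
constant cancels in the product of slope and rate.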
Generalizations 

Unequal distribution of queries. Suppose that the queries,
instead of being divided equally among $N$ sites, are divided arbitrarily,
a fraction $w_i$ being handled by the $i$th site. Then

$$\sum_{i=1}^{N} w_i = 1$$

and the appropriate quantity to take for the average load $\bar{G}$ at the
remote sites is the weighted average

$$\bar{G} = \sum_{i=2}^{N} w_i G_i / (1 - w_1).$$

Inequality (I) then becomes

(II)  $aHh\left(1 - \sum_{i=1}^{N} w_i^2\right)/(1 - w_1) + a(G_1 - \bar{G}) > T_n.$
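
As a consistency check, the left side of inequality (II) should reduce to
the left side of inequality (I) when every weight equals $1/N$. The sketch
below evaluates it; the function name and the numerical values are
assumptions made for illustration.

```python
def weighted_lhs(a, H, h, G, w):
    """Left side of inequality (II) for weights w = [w_1, ..., w_N].

    G_bar is the weighted average of the remote loads,
    sum_{i>=2} w_i G_i / (1 - w_1).
    """
    if abs(sum(w) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    if w[0] >= 1.0:
        raise ValueError("w_1 = 1 means nothing is distributed")
    remote_share = 1.0 - w[0]
    G_bar = sum(w_i * G_i for w_i, G_i in zip(w[1:], G[1:])) / remote_share
    spread = (1.0 - sum(w_i ** 2 for w_i in w)) / remote_share
    return a * H * h * spread + a * (G[0] - G_bar)


# Equal weights over three sites: the result matches aHh + a(G_1 - G_bar)
# from inequality (I), here 5 + 0.5 * (2 - 6) = 3 up to rounding.
equal = weighted_lhs(a=0.5, H=10, h=1, G=[2, 6, 6], w=[1 / 3, 1 / 3, 1 / 3])
print(equal)

# Keeping 60% of the queries locally changes the left side.
skewed = weighted_lhs(a=0.5, H=10, h=1, G=[2, 6, 6], w=[0.6, 0.2, 0.2])
print(skewed)
```

Distribution with the chosen weights is favored when the returned value
exceeds $T_n$.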

Once the concept of distributing the query load unequally among the
various sites is introduced, it becomes of interest to study
optimization of the distribution. What we mean by optimization is the
determination of a set of weights $w_1, w_2, \ldots, w_N$, such that $\bar{R}$
is a minimum. Let us consider how this problem can be solved for $N = 2$.
In this case $w_2 = 1 - w_1$, and we can write $\bar{R}$ in terms of the single
variable $w_1$, the fraction of query load to be handled locally. In detail,

$$\bar{R} = w_1 R_1 + (1 - w_1) R_2$$

$$= a[w_1(Vv + w_1 Hh + G_1 - \ell) + (1 - w_1)(Vv + (1 - w_1)Hh + G_2 - \ell)] + (1 - w_1)T_n$$

$$= aVv + aHh(w_1^2 + (1 - w_1)^2) + w_1 G_1 a + (1 - w_1)G_2 a - a\ell + (1 - w_1)T_n.$$


Then

$$\frac{d\bar{R}}{dw_1} = aHh(4w_1 - 2) + a(G_1 - G_2) - T_n.$$

If we set this derivative equal to zero, we find that there is a
prospective extremum at

$$w_1 = \frac{T_n - a(G_1 - G_2) + 2aHh}{4aHh},$$

$$w_2 = \frac{a(G_1 - G_2) - T_n + 2aHh}{4aHh}.$$

Since the second derivative ($4aHh$) is always positive, this extremum is
in fact a minimum, as desired. We must, however, examine another
constraint - that the weights $w_1$ and $w_2$ must both be positive. We can
rewrite $w_1$ and $w_2$ as

$$w_{1,2} = \frac{1}{2} \pm \frac{T_n - a(G_1 - G_2)}{4aHh}.$$

The weights $w_1$ and $w_2$ can be seen to be positive under a wide range of
conditions; for example, if $G_1 = G_2$ and inequality (I') holds.

Some interesting conclusions can immediately be read from the
equations for $w_1$ and $w_2$. First, we note that if the loads are equal
($G_1 = G_2$) the local site should always handle more than half of the
queries. Only when $T_n = a(G_1 - G_2)$, so that the network delay equals
the increase in response time due to load differential, should the query
loads be equalized. And only when $T_n$ is less than $a(G_1 - G_2)$ should the
local site send off more queries than it keeps.
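
These three conclusions can be reproduced numerically. The sketch below
evaluates the two-site optimum; clamping the result to the interval from 0
to 1 is an added assumption (it relies on the average response time being a
convex quadratic in $w_1$), and the function name and sample values are
hypothetical.

```python
def optimal_split(a, H, h, G1, G2, T_n):
    """Two-site optimum: w_1 = 1/2 + (T_n - a(G_1 - G_2)) / (4aHh), w_2 = 1 - w_1.

    The result is clamped to [0, 1]; because the average response time is a
    convex quadratic in w_1, the clamped value is the constrained minimum.
    """
    w1 = 0.5 + (T_n - a * (G1 - G2)) / (4 * a * H * h)
    w1 = min(1.0, max(0.0, w1))
    return w1, 1.0 - w1


# Equal loads: the local site keeps more than half of the queries.
print(optimal_split(a=0.5, H=4, h=1, G1=3, G2=3, T_n=1))    # (0.625, 0.375)

# T_n equals a(G_1 - G_2): the query loads are equalized.
print(optimal_split(a=0.5, H=4, h=1, G1=5, G2=3, T_n=1))    # (0.5, 0.5)

# T_n less than a(G_1 - G_2): more queries are sent off than kept.
print(optimal_split(a=0.5, H=4, h=1, G1=5, G2=3, T_n=0))    # (0.375, 0.625)

# Very large network delay: everything stays local after clamping.
print(optimal_split(a=0.5, H=4, h=1, G1=3, G2=3, T_n=100))  # (1.0, 0.0)
```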

It  must  again  be  emphasized  that  careful  measurements  are 
required  for  these  relationships  to  be  useful  for  real  decision  making. 
It is easy to estimate that $T_n$, $aHh$, and $a(G_1 - G_2)$ may all, under
reasonable  assumptions,  be  on  the  order  of  one  to  two  seconds.  This 
information  is  not  at  all  helpful  for  developing  long-term  strategies, 
but  merely  demonstrates  that  the  optimum  decision  on  query  sharing 
should  be  done  dynamically  and  only  after  monitoring  current  system 
usage  and  response. 

The  above  analysis  for  optimum  distribution  strategy  was 
done  for  the  N = 2 case.  The  general  case  can  be  handled  similarly, 
but  is  more  complicated  because  of  the  multi-variable  minimization. 
Setting  the  derivatives  to  zero  and  solving  yields  the  following 
equations for $i \neq 1$.
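
One way to carry out this minimization, offered here as a sketch under the
model's assumptions rather than as the report's own formulas, is to
minimize $\bar{R} = \sum_i w_i R_i$ subject to $\sum_i w_i = 1$ with a
Lagrange multiplier. Writing $c_1 = aG_1$ and $c_i = aG_i + T_n$ for
$i \neq 1$, the stationary point is $w_i = 1/N + (\bar{c} - c_i)/(2aHh)$,
where $\bar{c}$ is the plain average of the $c_i$. For $N = 2$ this reduces
to the expressions for $w_1$ and $w_2$ given earlier. The function name is
hypothetical, and the positivity constraint on the weights is not enforced.

```python
def stationary_weights(a, H, h, G, T_n):
    """Stationary point of the average response time for N sites.

    G is [G_1, ..., G_N] with index 0 the entry site. Returns the list
    w_i = 1/N + (c_bar - c_i) / (2aHh), which sums to 1 by construction.
    A negative entry means the positivity constraint is violated and the
    formula alone does not give a usable strategy.
    """
    N = len(G)
    # Per-site fixed cost: load term, plus network delay for remote sites.
    c = [a * G_i + (T_n if i != 0 else 0.0) for i, G_i in enumerate(G)]
    c_bar = sum(c) / N
    return [1.0 / N + (c_bar - c_i) / (2 * a * H * h) for c_i in c]


# N = 2 reproduces the two-site result w_1 = 0.625, w_2 = 0.375.
two_site = stationary_weights(a=0.5, H=4, h=1, G=[3, 3], T_n=1)
print(two_site)

# Three equally loaded sites: the entry site keeps the largest share and
# the two remote sites receive equal shares.
three_site = stationary_weights(a=0.5, H=4, h=1, G=[3, 3, 3], T_n=1.5)
print(three_site)
```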


Usage from various sites. All of the analysis so far has been
under the assumption that the query load all originates at a single
site. Suppose instead that each site $j$ generates some fraction $f_j$ of
the total query load $H$. Site $j$ then distributes its query load with a
strategy described by weights $w(j)_1, w(j)_2, \ldots, w(j)_N$. The net rate
of input of queries that a site $i$ must respond to is then given by

$$H_i = \sum_j f_j H w(j)_i,$$

so that site $i$'s response time (i.e. time to respond to a query) is

$$R_i = a(Vv + H_i h + G_i - \ell).$$

From the point of view of site $j$, the average response time seen is
computed as

$$\bar{R}_j = \sum_i w(j)_i R_i + (1 - w(j)_j) T_n,$$

since a network delay of $T_n$ is observed for the fraction of queries
answered remotely. Now to get an average response time for queries
originated throughout the network, we must take another weighted
average:

$$\bar{R} = \sum_j f_j \bar{R}_j.$$

Combining the preceding four equations, we get an equation for $\bar{R}$ in
terms of the $N^2$ variables $w(j)_i$. As above, we can carry out an
optimization analysis or compare various strategies. (For example, the
strategy where each site handles its own queries is described by
$w(j)_i = 1$ when $i = j$ and $w(j)_i = 0$ otherwise.) We will not go into
further details on this generalization in this brief report.
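
For comparing strategies, the four equations of this generalization can be
combined in a few lines. The sketch below is illustrative only: the
function name is hypothetical, `f[j]` stands for $f_j$, and `w[j][i]`
stands for $w(j)_i$, the fraction of site $j$'s queries answered at site
$i$.

```python
def network_average_response(a, ell, V, v, H, h, G, T_n, f, w):
    """Average response time over queries originating anywhere in the network."""
    N = len(G)
    # Net query rate each site must respond to: H_i = sum_j f_j H w(j)_i.
    H_i = [sum(f[j] * H * w[j][i] for j in range(N)) for i in range(N)]
    # Response time at each site, excluding any network delay.
    R = [a * (V * v + H_i[i] * h + G[i] - ell) for i in range(N)]
    # Average seen by originating site j; the remote fraction adds T_n.
    R_seen = [
        sum(w[j][i] * R[i] for i in range(N)) + (1 - w[j][j]) * T_n
        for j in range(N)
    ]
    return sum(f[j] * R_seen[j] for j in range(N))


params = dict(a=0.5, ell=2, V=1, v=1, H=4, h=1, G=[3, 3], T_n=0.5)

# All queries originate at site 1 and each site handles its own queries:
# this reproduces the single-site response time R_s = 3.0.
own = network_average_response(f=[1, 0], w=[[1, 0], [0, 1]], **params)

# Site 1 splits its queries equally with site 2: average 2.25.
split = network_average_response(f=[1, 0], w=[[0.5, 0.5], [0, 1]], **params)

print(own, split)
```

With all queries originating at one site, the two strategies shown recover
the single-site and equally distributed response times of the basic model,
which is a useful sanity check before trying other weight matrices.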

Proposed further generalizations. We list here several other
ways in which assumptions may be relaxed and the model made more flexible.

1) We have assumed that $T_n$ is a constant. To be realistic, $T_n$
should depend upon the two sites between which the messages (query and
response) are traveling. Thus we need to insert into the model a set of
values $T_n(i,j)$. In addition, the values of $T_n(i,j)$ may vary depending
upon what routes are taken - but it is undoubtedly adequate to take
average values. Finally, the values of $T_n(i,j)$ will vary with the
amount of network traffic. In particular, if we assume that query
traffic forms a non-negligible percent of net traffic, $T_n(i,j)$ will be
some function of $H$. Theoretical analysis (e.g. by queueing theory) can
probably be used to determine this function, which will depend on network
parameters as well as on $H$ and the distribution strategy.

2) We have assumed that the parameters $a$ and $\ell$, which describe
response time as a function of "load," are the same for all sites. This
assumption is not true in a heterogeneous network, or for a network of
dissimilarly configured "homogeneous" hosts (e.g. the PWIN). It should
be noted, however, that the parameter $\ell$ did not enter into any of the
decision relations, except in the assumption that "load" must be large
enough compared to $\ell$, so that the linear expression for response time
holds. In addition, we have seen that $a$ is measured only as it occurs
combined with other factors. That is, differences in $a$ may be taken
into account by varying the $H_i$. (See preceding section.) Thus the
practical impact on the model of allowing $a$ and $\ell$ to vary from site to
site is probably minimal.

3) We have assumed that all sites have an equal load associated
with updating the data base. This will not in general be true.
If, say, the updates all originate at Site 1, the other sites will all
incur network overhead in processing the updates. On the other hand
Site 1 may do much preprocessing to make the update task simpler at the
remote sites. Details of this sort could be built into the model.
Results can not be too different, however, since site differences in the
terms $Vv$ can always be subsumed in the $G_i$.

4) We have assumed that $N$ sites in the network have up-to-date
copies of the data base and the problem is to determine a strategy for
distributing the queries as they are entered into the system. A
somewhat different, but very similar, model is needed to study the problem
of setting a policy for distributing the data base - i.e., for
determining which sites should have copies. In this case a careful analysis
of the effects of updates will be essential. The sites at which updates
originate will find their loads increased by the necessity of
distributing the updates to other sites having a copy of the data base.

Remote sites holding a copy of the data base will all have increased
loads due to the processing of updates and associated network overhead.
Strategies for synchronization and aspects of multi-copy management will
affect loads and hence response times. These effects will, of course,
implicitly enter into the query distribution model, but there we assumed
that response times were obtained by direct system measurement, so that
lower-level details were not modeled, but included in measured
parameters. To determine a distribution policy a priori requires modeling
these lower-level effects and, if possible, optimizing the lower-level
strategies (e.g. for synchronization) so that the policy is based on the
best up-to-date technology.

5) The four proposed generalizations listed above involve
relatively straightforward extensions of the approach described in this
report. In addition, however, the approach itself needs investigation.
That is, we have assumed that response time is a simple linear function
of "load," and that "load" can be described as a simple linear
combination of updates, queries, etc. It should be possible to examine these
assumptions - both by measurement and by stochastic queueing analysis.
A careful examination of this type is expected to yield some refinements
in the relations studied here.
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